The goal of this work is to develop a command-line tool able to take commands in natural language and have them executed by Sage, a collection of Computer Algebra packages presented in a uniform way. We present here instructions on how to build the interface and examples of its intended use.
You'll need:
- cabal, as in the Haskell Platform
- the sage command (it is assumed to be in your PATH)
You can get the source version of GF by:
cabal install gf
We can install the other dependencies too by:
cabal install json curl
Check out the mathematics grammar library from:
svn co svn://molto-project.eu/mgl
This is the active branch. For the fixed one use:
svn co svn://molto-project.eu/tags/D6.2
Go into the mgl/sage directory (D6.2/sage if you're using the fixed branch) and make it:
cd mgl/sage
make
The first time you run make, it will fail, asking you to make modifications in the Sage installation. Please refer to the installation page.
Now try to build gfsage again. All these build operations will ask Sage to "rebuild" itself. Be warned that the first rebuild takes some time:
make
The system has been tested on Mac (OS X 10.7) and Linux (Ubuntu).
Run the tool as:
./gfsage english
giving the input language as argument. It will take some seconds to start the server. After that it will reply with some server information and will show the prompt:
sage>
You can then enter your query:
sage> compute the product of the octal number 12 and the binary number 100.
(3) 40
answer: it is 40 .
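As a sanity check of the arithmetic in this query, independent of gfsage and Sage, the same product can be computed in plain Python. The variable names are illustrative only.

```python
# Octal 12 is 10 in decimal and binary 100 is 4 in decimal.
octal_value = int("12", 8)
binary_value = int("100", 2)

# Their product matches the answer reported in the session.
product = octal_value * binary_value
print(product)
```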
To show that a CAS is actually behind the scene, let's try something symbolic:
sage> compute the greatest common divisor of x and the product of x and y.
(4) x
answer: it is x .
and compare it with:
sage> compute the greatest common divisor of x and the sum of x and y.
(5) 1
answer: it is 1 .
Sage does the right thing in both cases, x and y being unbound numeric variables.
sage> compute the second iterated derivative of the cosine at pi.
(6) 1
answer: it is 1 .
Exit the session by pressing Ctrl+D. This way the server exits cleanly.
Just another example in a different language:
./gfsage spanish
Login into localhost at port 9000
Session ID is c1ef10dfd49e4fdb3214fa6d3a3b9c92
waiting... EmptyBlock 2
finished handshake. Session is c1ef10dfd49e4fdb3214fa6d3a3b9c92
sage> calcula la parte imaginaria de la derivada de la exponencial en pi.
(4) 0
answer: es 0 .
More recent examples involving integer literals and integration:
sage> compute the sum of 1, 2, 3, 4 and 5.
(3) 15
answer: it is 15 .
sage> compute the summation of x when x ranges from 1 to 100.
(4) 5050
answer: it is 5050 .
sage> compute the integral of the cosine from 0 to the quotient of pi and 2.
waiting...
(5) 1
answer: it is 1 .
sage> compute the integral of the function mapping x to the square root of x from 1 to 2.
(6) 4/3*sqrt(2) - 2/3
answer: it is 4 over 3 times the square root of 2 minus the quotient of 2 and 3 .
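The answers in this session can be cross-checked numerically without Sage. The sketch below uses only the Python standard library; the `simpson` helper is a hypothetical name introduced here for a composite Simpson rule, and it approximates the integrals rather than computing them symbolically as Sage does.

```python
import math


def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd interior nodes get weight 4, even interior nodes weight 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


sum_literal = sum([1, 2, 3, 4, 5])       # sum of 1, 2, 3, 4 and 5
summation = sum(range(1, 101))           # x ranging from 1 to 100
cos_integral = simpson(math.cos, 0, math.pi / 2)
sqrt_integral = simpson(math.sqrt, 1, 2)
closed_form = 4 / 3 * math.sqrt(2) - 2 / 3   # the expression Sage returned

print(sum_literal, summation)
print(abs(cos_integral - 1) < 1e-9, abs(sqrt_integral - closed_form) < 1e-9)
```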
Use English:
gfsage
Use LANGUAGE:
gfsage LANGUAGE
General invocation:
gfsage [OPTIONS]
where OPTIONS are:
short form | long form | description
---|---|---
-h | --help | Print usage page
-i LANGUAGE | --input-lang=LANGUAGE | Make queries in LANGUAGE
-o LANGUAGE | --output-lang=LANGUAGE | Give answers in LANGUAGE
-V LEVEL | --verbose=LEVEL | Set the verbosity LEVEL
-t FILE | --test=FILE | Test samples in FILE
-v[VOICE] | --voice[=VOICE] | Use voice output. To list voices use ? as VOICE.
-F | --with-feedback | Restate the query when answering.
This condition is signaled by the message:
gfsage: Connecting CurlCouldntConnect
I used a Linux virtual machine to reproduce this condition and found that, sometimes, it takes about 10 retries for the server to catch, but then it stays running fine for hours. My guess is that it is related to some timeout limit in the server. Killing the orphaned python processes from the previous retries might help too (killall python).
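The manual retries described above could in principle be automated. The following is only a sketch of that idea in Python, not something gfsage does: `connect_with_retries`, `make_flaky` and their parameters are hypothetical names, and the connection attempt is passed in as a callable so that the loop can be exercised without a running server.

```python
import time


def connect_with_retries(connect, max_attempts=10, delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds or max_attempts is exhausted.

    Returns the connection and the number of attempts that were needed.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return connect(), attempt
        except ConnectionError as err:
            last_error = err
            if attempt < max_attempts:
                sleep(delay)  # give the server time to come up
    raise ConnectionError(f"gave up after {max_attempts} attempts") from last_error


def make_flaky(failures):
    """Build a stand-in connect() that fails a fixed number of times first."""
    state = {"left": failures}

    def connect():
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionRefusedError("server not ready")
        return "connected"

    return connect


# A server that refuses the first 3 attempts is reached on the 4th.
result, attempts = connect_with_retries(make_flaky(3), sleep=lambda _: None)
print(result, attempts)
```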
realsets.py is a Sage module that supports subsets of the real field consisting of intervals and isolated points. It was developed to demonstrate the set operations of the MGL Set1 module.
It is based on previous work from Interval1Sage, adding integration on real sets and real intervals.
An object in this module consists of a list of disjoint open intervals plus a list of isolated points (not belonging to these intervals). Notice that Infinite is acceptable as an interval bound. A RealSet therefore represents a set that can be the union of some intervals and isolated points. For example, one can define:
A closed interval:
? RealSet.cc_interval(1,4);
[ 1 :: 4 ]
A single point:
? RealSet.singleton(1)
{1}
Union is supported with intervals and can be nested:
? I = RealSet.co_interval(1, 4)
? J = RealSet.co_interval(4, 5)
? M = RealSet.oc_interval(7, 8)
? I.union(J).union(M)
[ 1 :: 5 [ ∪ ] 7 :: 8 ]
? I.intersection(J)
()
? I.intersection(RealSet.cc_interval(2,5))
[ 2 :: 4 [
Is a point in the set?
? I = RealSet.oo_interval(1, 3)
? 2 in I
True
? 3 in I
False
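To make the bound handling in the membership and intersection examples concrete, here is a toy model of a single interval in plain Python. It is not the realsets.py implementation: the `Interval` class and its field names are assumptions made for illustration, and it only covers one interval rather than a full RealSet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float
    lo_closed: bool
    hi_closed: bool

    def __contains__(self, x):
        above = x > self.lo or (self.lo_closed and x == self.lo)
        below = x < self.hi or (self.hi_closed and x == self.hi)
        return above and below

    def intersection(self, other) -> Optional["Interval"]:
        # The larger lower bound wins; on a tie it is closed only if both are.
        if self.lo > other.lo:
            lo, lo_closed = self.lo, self.lo_closed
        elif self.lo < other.lo:
            lo, lo_closed = other.lo, other.lo_closed
        else:
            lo, lo_closed = self.lo, self.lo_closed and other.lo_closed
        # The smaller upper bound wins, with the same tie rule.
        if self.hi < other.hi:
            hi, hi_closed = self.hi, self.hi_closed
        elif self.hi > other.hi:
            hi, hi_closed = other.hi, other.hi_closed
        else:
            hi, hi_closed = self.hi, self.hi_closed and other.hi_closed
        # Empty when the bounds cross, or meet at an excluded point.
        if lo > hi or (lo == hi and not (lo_closed and hi_closed)):
            return None
        return Interval(lo, hi, lo_closed, hi_closed)

    def __str__(self):
        left = "[" if self.lo_closed else "]"
        right = "]" if self.hi_closed else "["
        return f"{left} {self.lo} :: {self.hi} {right}"


I = Interval(1, 4, True, False)   # plays the role of co_interval(1, 4)
J = Interval(4, 5, True, False)   # plays the role of co_interval(4, 5)
K = Interval(2, 5, True, True)    # plays the role of cc_interval(2, 5)
O = Interval(1, 3, False, False)  # plays the role of oo_interval(1, 3)

print(I.intersection(J))          # None stands for the empty set here
print(I.intersection(K))
print(2 in O, 3 in O)
```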
Is a set discrete (i.e., does it contain no intervals)?
? RealSet.oo_interval(0,1).discrete
False
? RealSet(points=(1,2,3)).discrete
True
The size of a discrete set is the number of its points:
? RealSet(points=range(5)).size
5
? RealSet.oo_interval(0,3).size
+Infinity
Is A a subset of B?
? A = RealSet.oo_interval(0,1)
? B = RealSet.cc_interval(0,1)
? RealSet().subset(A)
True
? B.subset(A)
False
? A.subset(B)
True
? A.subset(A)
True
? A.subset(A, proper=True)
False
Return the infimum (greatest lower bound)
? RealSet(points=range(3)).infimum()
0
? RealSet.oo_interval(1,3).infimum()
1
The opposite of a set: –A = {-x | x ∈ A}
? -RealSet.oo_interval(1,2)
] -2 :: -1 [
Return the supremum (least upper bound)
? RealSet(points=range(3)).supremum()
2
? RealSet.oo_interval(1,3).supremum()
3
The complement of a set:
? RealSet.oo_interval(2,3).complement()
] -Infinity :: 2 ] ∪ [ 3 :: +Infinity [
? RealSet(points=range(3)).complement()
] 0 :: 1 [ ∪ ] 1 :: 2 [ ∪ ] 2 :: +Infinity [ ∪ ] -Infinity :: 0 [
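The complement of a finite set of points is the union of the open intervals between consecutive points, plus the two unbounded ends. A minimal sketch of that computation in plain Python follows; `complement_of_points` is a hypothetical helper, not part of realsets.py, and it returns the open intervals as (lower, upper) pairs in increasing order, which differs from the order printed above.

```python
import math


def complement_of_points(points):
    """Return the open intervals making up the complement of a finite point set."""
    bounds = [-math.inf] + sorted(set(points)) + [math.inf]
    # Pair each bound with its successor: (b0, b1), (b1, b2), ...
    return list(zip(bounds, bounds[1:]))


gaps = complement_of_points(range(3))
print(gaps)
```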
The set difference of A and B: {x ∈ A | x ∉ B}
? I = RealSet.oo_interval(2,+Infinity)
? J = RealSet.oo_interval(-Infinity, 5)
? I.setdiff(J)
[ 5 :: +Infinity [
? J.setdiff(I)
] -Infinity :: 2 ]
gfsage is a prototype to demonstrate two-way natural language communication between a user and a Sage system.
When you invoke the gfsage command interactively:
The details of these components are given below.
A GF module acts as a post office translating messages between the different parties (nodes) composing a dialog. This section is more a description of a proposed design strategy for a generic postoffice interface based on GF. The actual code implements ideas of this design, but, for instance, it contains no edges or nodes as explicit entities.
gfsage deals with just 2 agents: the user and the Sage system. If the input language is different from the output language, we may consider a third node (the output user).
There is a unique pgf module containing all the GF information the dialog system needs to work: Commands.pgf. Each node has a language (a GF concrete module) assigned: the user uses a natural language (e.g., CommandsEng for English).
A node reacts to received messages by sending a reply. The chain of messages between two nodes is called a dialog. An active node, such as the user, can start a dialog by sending a message. A passive node, like the Sage system here, just replies to the received messages.
A node can receive:
- A no_parse message from the postoffice, telling that a previous outgoing message cannot be parsed.
- An is_ambiguous message from the postoffice related to a previous message sent by the node, specifying that it was ambiguous and carrying additional info for the node to decide among the possible meanings. To respond to this, the node must send a disambiguate message to the postoffice (see below).
A node can send:
- A disambiguate message, sent in response to an ambiguous message. In this message the node chooses one of the options or aborts the transaction.
A regular message between two given nodes corresponds to a fixed GF category. In the case of gfsage it is Command for messages traveling from User to Sage and Answer for messages going the other way.
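The way a postoffice might choose between forwarding a regular message and replying with no_parse or is_ambiguous can be sketched as follows. This is an illustration in Python, not the actual gfsage code: it assumes the postoffice computes a list of parse results for each message, and `Reply`, `route`, `toy_parse` and the toy grammar are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Reply:
    kind: str        # "no_parse", "is_ambiguous" or "forward"
    payload: object  # original text, candidate parses, or the single parse


def route(message: str, parse: Callable[[str], List[str]]) -> Reply:
    """Decide what the postoffice does with one regular message."""
    trees = parse(message)
    if not trees:
        # Nothing could be parsed: tell the sender.
        return Reply("no_parse", message)
    if len(trees) > 1:
        # Several readings: ask the sender to disambiguate.
        return Reply("is_ambiguous", trees)
    # Exactly one reading: push it downstream to the receiving node.
    return Reply("forward", trees[0])


# Toy stand-in for a GF parser, mapping sentences to made-up parse results.
TOY_GRAMMAR = {
    "compute the factorial of 5.": ["factorial(5)"],
    "compute it.": ["reading-1", "reading-2"],
}


def toy_parse(message: str) -> List[str]:
    return TOY_GRAMMAR.get(message, [])


print(route("compute the factorial of 5.", toy_parse).kind)
print(route("compute it.", toy_parse).kind)
print(route("hello there", toy_parse).kind)
```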
A regular message from node N1 to node N2 goes through the following steps:
If the computed set is empty, a no_parse message is sent back to the sending node. If it contains more than one entry, an is_ambiguous message is sent. In both cases, the process stops here. Only when the computed set contains just one entry is that entry pushed downstream to node N2.
For Sage to work alongside GF, we need an HTTP server listening to Sage commands and some scripts to set up the environment and respond to the type of queries that can be expressed in the Mathematics Grammar Library, MGL.
A Sage process is started in the background by the start-nb.py script in -python mode. This script starts a Sage notebook, as described in Simple server API, listening on port 9000 and accepting requests in HTTP format. It also installs a handler for cleanly disposing of the notebook object whenever the parent process terminates.
The parent process then sends an initial request to load some functions and variables that we'll need in the dialog system, defined in prelude.sage, and goes into the main evaluation loop.
- realsets.py: Set1 module of the MGL. (See the page about it)
- prelude.sage
OS X has voice output built in, usable from the shell by way of the say command. You can use several voices in English or download more for other languages.
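As an illustration of driving the say command from a program, the sketch below builds the command line in Python and runs it only when say is available. It is not how gfsage invokes say: `say_command` and `speak` are hypothetical helper names, and the sketch relies only on the `-v VOICE` option of say.

```python
import shutil
import subprocess


def say_command(text, voice=None):
    """Build the argument list for the OS X say command."""
    argv = ["say"]
    if voice is not None:
        argv += ["-v", voice]   # select a specific voice
    argv.append(text)
    return argv


def speak(text, voice=None):
    """Speak text if the say command exists; return whether it was run."""
    if shutil.which("say") is None:
        return False            # not on OS X, or say is not in PATH
    subprocess.run(say_command(text, voice), check=True)
    return True


print(say_command("the factorial of 5 is 120", voice="Agnes"))
```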
Build gfsage in mgl/sage as described previously.
gfsage                  Use english
gfsage LANGUAGE         Use this language
gfsage [OPTIONS]
where OPTIONS are:
  -h         --help                print this page
  -i INPUT   --input-lang=INPUT    Make queries in LANGUAGE
  -o OUTPUT  --output-lang=OUTPUT  Give answers in LANGUAGE
  -v[VOICE]  --voice[=VOICE]       use voice output. To list voices use ? as VOICE.
  -F         --with-feedback       Restate the query when answering.
The options relevant here are -v and -F. Use the first to select voice output. With no argument it will pick the first available voice for the OUTPUT language selected:
./gfsage -i english -v
Voiced by Agnes
... It will use Agnes as the English voice. Notice that if you do not give a -o option, the OUTPUT language is assumed to be the same as the INPUT language.
To list the available voices use:
./gfsage -i english -v?
Agnes, Albert, Alex, Bahh, Bells, Boing, Bruce, Bubbles, Cellos, Daniel, Deranged, Fred, Hysterical, Junior, Kathy, Princess, Ralph, Trinoids, Vicki, Victoria, Whisper, Zarvox
It will list the English voices. To use a specific voice write:
./gfsage -i german -vYannick
Voiced by Yannick
The option -F makes the system paraphrase your query when answering. First, get a simple answer:
./gfsage -i english
Login into localhost at port 9000
Session ID is df7ad7c769f2faac68b6bb9489bb97e2
waiting... EmptyBlock 3
sage> compute the factorial of 5.
(4) 120
answer: it is 120 .
... and now the same with paraphrasing:
./gfsage -i english -F
Login into localhost at port 9000
Session ID is 88549994a28940fe0657eb9e506a5e84
waiting... EmptyBlock 3
sage> compute the factorial of 5.
(4) 120
answer: the factorial of 5 is 120 .
So, to experience voice output in its full glory you have to use both -v and -F.
To help with regression testing I recently added a test option to gfsage for batch-testing the system by reading dialog samples from a file.
The samples must be in a text file and consist of a sequence of dialogs, each of which is a sequence of queries and responses exchanged with the Sage system. Notice that a dialog might carry a state in the form of assumptions that are asserted or variables that are assigned. However, each dialog is completely independent of the others.
Each dialog starts with a BEGIN or BEGIN language line. It specifies the beginning of dialog triplets and the natural language for these triplets. The dialog runs until an END line. The language specified becomes the current language. Dialogs with no given language are assumed to be in the current language. At the start of a testing suite, the current language is English.
A triplet is a sequence of 3 lines: the query, the response from Sage, and the answer in natural language:
BEGIN spanish
calcula el factorial del número octal 11.
362880
es 362880 .
END
BEGIN english
let x be 4 .


compute the sum of x and 5 .
9
it is 9 .
compute the sum of it and 5 .
14
it is 14 .
END
Notice that blank lines are relevant: they mark that Sage responded nothing to the query. Therefore, blank lines must not be inserted between triplets or between dialogs.
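The sample file format described above can be read with a few lines of code. The following Python sketch is not the parser used by gfsage: `parse_samples` is a hypothetical function, and it assumes that each triplet is laid out as query, Sage response and natural language answer, with empty lines standing for empty responses.

```python
def parse_samples(text, default_language="english"):
    """Split a samples file into (language, [triplet, ...]) dialogs."""
    dialogs = []
    language = default_language
    lines = iter(text.splitlines())
    for line in lines:
        header = line.split()
        if not header or header[0] != "BEGIN":
            raise ValueError(f"expected BEGIN, got {line!r}")
        if len(header) > 1:
            language = header[1]   # becomes the current language
        body = []
        for inner in lines:        # keep consuming the same iterator
            if inner.strip() == "END":
                break
            body.append(inner)
        else:
            raise ValueError("dialog without END")
        if len(body) % 3 != 0:
            raise ValueError("dialog body is not a whole number of triplets")
        triplets = [tuple(body[i:i + 3]) for i in range(0, len(body), 3)]
        dialogs.append((language, triplets))
    return dialogs


sample = "\n".join([
    "BEGIN spanish",
    "calcula el factorial del número octal 11.",
    "362880",
    "es 362880 .",
    "END",
    "BEGIN english",
    "let x be 4 .",
    "",
    "",
    "compute the sum of x and 5 .",
    "9",
    "it is 9 .",
    "END",
    "BEGIN",
    "compute the factorial of 5.",
    "120",
    "it is 120 .",
    "END",
])

dialogs = parse_samples(sample)
for language, triplets in dialogs:
    print(language, len(triplets))
```

The third dialog has no language on its BEGIN line, so it inherits the current language set by the previous dialog.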
gfsage --test=FILE will test the dialogs in FILE and report the differences. You get a summary of the results:
Dialog 'compute Gamma....' failed
18 out of 19 dialogs successful.