The role of a grammar or automata is to reduce the uncertainty inherent in the identification of the input data.
We use trivial example to demonstrate the inside-outside algorithm.
Suppose we have data \(\mathbf{O}=\{a,a,b,a,a\}\) generated from generative probabilistic context-free grammar \(\mathbf{G}\) model. Assume we know the grammar \(\mathbf{G}\) , then how does the inside-outside algorithm works?
The grammar \(\mathbf{G}\) can be intuitively described as
\[S \to AbA \\
A \to aA | a\]
Xingzhong Xu
from geek to entrepreneur
Sunday, April 21, 2013
Wednesday, January 23, 2013
Home Network
I recently figure out that the FIOS's IP address is almost for sure not change as long as the router is working properly. Consider my personal requirements and the FIOS quality, use my own computer as a server seems attractive for me. So here I will introduce my home internet topology and the services that set up by myself.
Generally, the following steps are necessary,
- Set your domain name to your home FIOS IP address.
- The crappy FIOS router is not stable, I use secondary router Linksys E3000 with Tomato Firmware. And set E3000's IP address from FIOS to 192.168.1.2.
- Setting E3000's IP to different IP segment, e.g 192.168.2.1. And Port forward the port 20(ssh) and 80(http) from FIOS to E3000
- Install a Linux Server on your old and unused laptop (for me, Thinkpad X61s, 2*1.6GHz, 2G RAM, around 20W) as the home server.
- Allocate a static IP 192.168.2.* for X61s from E3000, and set DNS in Tomato if necessary.
- Port forward the E3000's port 80 and 22 to the X61s.
- Keep doing step 4-6 to set up the other devices in your home network.
Sunday, November 11, 2012
PUT Request in web2py
To my surprise, the parameters in PUT request are not well parsed at web2py. As you probably already knows that request.get_vars and request.post_vars are the most convenient way to access the request parameter. Then what about PUT request? request.put_vars? The answer is no. However, good news is that they are still available in request object.
So here is my dirty method, to make it work,
So here is my dirty method, to make it work,
# Xingzhong for yesgoody # Run @ web2py import urlparse put_vars = dict(urlparse.parse_qsl(request.body.read()))
Saturday, September 22, 2012
Post JSON object to Google App Engine
Using GAE as the server to accept JSON data is a interesting AJAX example. Here is a sample code that I write for yesgoody.
Requirements:
My Google App Engine (Python) want accept JSON post input, then process, finally given it back to the client. Here the client means a jquery page.
Here is my code for server side,
Here is a simple jquery to demonstrate the functionality,
Requirements:
My Google App Engine (Python) want accept JSON post input, then process, finally given it back to the client. Here the client means a jquery page.
Here is my code for server side,
# Xingzhong for yesgoody
# Run @ GAE Server
import webapp2
import json
import datetime
class MainHandler(webapp2.RequestHandler):
def get(self):
self.error(403) #access denied
def post(self):
req = json.loads(self.request.body)
res = {"time":datetime.datetime.now().isoformat()}
res['data'] = req
self.response.headers['Content-Type'] = 'application/json'
self.response.out.write(json.dumps(data))
Here is a simple jquery to demonstrate the functionality,
/*yesgoody debug for post json to server*/
function submit(){
postdata = new Object();
postdata.email = "xu@yesgoody.com";
postdata.name = "Xingzhong";
$.ajax({url: '/comm',
type: 'POST',
contentType: 'application/json',
data: JSON.stringify(postdata),
dataType: 'json',
success: function(response) {
var output = JSON.stringify(response);
console.log(output);
}
});
}
Wednesday, August 8, 2012
Grammar Learning (1)
I will consider the task that unsupervised learning probabilistic context-free grammar (PCFG) from data. The following toy example will demonstrate where I should start up.
Suppose given the following 1-D discrete sequence
\[S = abcabcab\]
What is the proper PCFG $G$ can generalized above $S$?
Intuitively, we could infer such grammar $G$ is,
\[ T_1 \implies ab\]
\[ T_2 \implies T_1cT_2 | T_1 \]
Then we can generalize $S$ as,
\[ S \implies T_2 \implies T_1cT_2 \implies T_1cT_1cT_2 \implies T_1cT_1cT_1 \implies abcabcab\].
We can use parsing tree to visualize it as, [S [T_2 [T_1 [a] [b]] [c] [T_2 [T_1 [a] [b]] [c] [T_2 [T_1 [a] [b]]]]]]
Obviously, parsing as well as grammar is not unique for $S$. We could easily generalize grammar $G'$ that,
\[ T_1 \implies ab \]
\[ T_2 \implies T_1T_2T_1|cT_1c \]
Then we can generalize $S$ as,
\[ S \implies T_2 \implies T_1T_2T_1 \implies T_1cT_1cT_1 \implies abcabcab\]
Graphically, it will be, [S [T_2 [T_1 [a] [b]] [T_2 [c] [T_1 [a] [b]] [c]] [T_1 [a] [b]]]]
Though there will be multiple grammar that may fit this data, let's only focus on above two. The natural question is which is better for data $S$?
By answering this question, let's firstly take a look at the grammar itself. What kinds of sequence $S$ can be generalized based on $G$ and $G'$.
By applying $G$, we can write
\[ ab c ab c ab c ab c ... ab c ab \]
On contrast, the $G'$ can be generalized to
\[ ab ab ... ab c ab ... ab ab\]
Therefore, we can easily distinguish the better model from generative perspective.
In the following, we use python code to demonstrate the above parsing problem.
We combines the above grammars to a single one in cst_grammar.
Therefore, if everything OK, the code should generates the above parsing tree.
The two output trees
[S
[T2
[T1 [A a] [B b]]
[T2 [C c] [T1 [A a] [B b]] [C c]]
[T1 [A a] [B b]]]]
[S
[T3
[T1 [A a] [B b]]
[C c]
[T3 [T1 [A a] [B b]] [C c] [T3 [T1 [A a] [B b]]]]]]
Suppose given the following 1-D discrete sequence
\[S = abcabcab\]
What is the proper PCFG $G$ can generalized above $S$?
Intuitively, we could infer such grammar $G$ is,
\[ T_1 \implies ab\]
\[ T_2 \implies T_1cT_2 | T_1 \]
Then we can generalize $S$ as,
\[ S \implies T_2 \implies T_1cT_2 \implies T_1cT_1cT_2 \implies T_1cT_1cT_1 \implies abcabcab\].
We can use parsing tree to visualize it as, [S [T_2 [T_1 [a] [b]] [c] [T_2 [T_1 [a] [b]] [c] [T_2 [T_1 [a] [b]]]]]]
\[ T_1 \implies ab \]
\[ T_2 \implies T_1T_2T_1|cT_1c \]
Then we can generalize $S$ as,
\[ S \implies T_2 \implies T_1T_2T_1 \implies T_1cT_1cT_1 \implies abcabcab\]
Graphically, it will be, [S [T_2 [T_1 [a] [b]] [T_2 [c] [T_1 [a] [b]] [c]] [T_1 [a] [b]]]]
By answering this question, let's firstly take a look at the grammar itself. What kinds of sequence $S$ can be generalized based on $G$ and $G'$.
By applying $G$, we can write
\[ ab c ab c ab c ab c ... ab c ab \]
On contrast, the $G'$ can be generalized to
\[ ab ab ... ab c ab ... ab ab\]
Therefore, we can easily distinguish the better model from generative perspective.
In the following, we use python code to demonstrate the above parsing problem.
We combines the above grammars to a single one in cst_grammar.
Therefore, if everything OK, the code should generates the above parsing tree.
import nltk
cst_grammar = """
S -> T2 [0.5] | T3 [0.5]
T1 -> A B [1.0]
T2 -> T1 T2 T1 [0.5] | C T1 C [0.5]
T3 -> T1 C T3 [0.5] | T1 [0.5]
A -> 'a' [1.0]
B -> 'b' [1.0]
C -> 'c' [1.0]
"""
grammar = nltk.parse_pcfg (cst_grammar)
case = list ('abcabcab')
parser = nltk.InsideChartParser (grammar)
trees = parser.nbest_parse (case)
for tree in trees:
print tree.pprint (parens = '[]')
tree.draw()
The two output trees
[S
[T2
[T1 [A a] [B b]]
[T2 [C c] [T1 [A a] [B b]] [C c]]
[T1 [A a] [B b]]]]
[S
[T3
[T1 [A a] [B b]]
[C c]
[T3 [T1 [A a] [B b]] [C c] [T3 [T1 [A a] [B b]]]]]]
Subscribe to:
Posts
(
Atom
)

.png)

