Wednesday, December 27, 2006

Keyboard Shortcuts for IE7 and Google

IE7 Keyboard Shortcuts

Google Toolbar Keyboard Shortcuts
Google Desktop Keyboard Shortcuts
Google Gmail Keyboard Shortcuts
Google Calendar Keyboard Shortcuts
Google Reader Keyboard Shortcuts

How to Copy IE7 Search Providers List

I have found the following convenient yet safe way to copy the IE7 search provider list from computer A to computer B.
(1) On computer A, add all desired search providers to IE7.
(2) On computer A, use regedit to find and export the following key to a .reg file: HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\SearchScopes.
(3) Copy that .reg file from computer A to computer B.
(4) On computer B, run (double-click) that .reg file to import the corresponding registry key.
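Steps (2) and (4) can probably also be scripted rather than done by hand in regedit. Below is a minimal sketch in Python, assuming the standard Windows `reg` command-line tool (`reg export` and `reg import`) is available on both computers. The function names and the `providers.reg` file name are my own illustrative choices, not part of any official procedure.

```python
import subprocess

# Registry key that holds the IE7 search providers (from step 2 above).
# "HKCU" is the abbreviation reg.exe accepts for HKEY_CURRENT_USER.
SEARCH_SCOPES_KEY = r"HKCU\Software\Microsoft\Internet Explorer\SearchScopes"


def build_export_command(reg_file):
    """Return the reg.exe command that exports the key to reg_file."""
    return ["reg", "export", SEARCH_SCOPES_KEY, reg_file]


def build_import_command(reg_file):
    """Return the reg.exe command that imports reg_file into the registry."""
    return ["reg", "import", reg_file]


def export_search_providers(reg_file):
    """Run on computer A (Windows only). Raises if reg.exe reports failure."""
    subprocess.run(build_export_command(reg_file), check=True)


def import_search_providers(reg_file):
    """Run on computer B (Windows only) after copying reg_file over."""
    subprocess.run(build_import_command(reg_file), check=True)
```

On computer A one would call `export_search_providers("providers.reg")`, copy the file across, and then call `import_search_providers("providers.reg")` on computer B. If the target file already exists, `reg export` may ask for confirmation before overwriting it, so it is simplest to export to a fresh file name.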

References
[1] Add Search Providers to Internet Explorer 7
[2] How to create custom .adm or .admx files to add search providers to the toolbar search box in Internet Explorer 7

Thursday, October 26, 2006

Am I a nerd?

Take this Nerd Test to find out how nerdy you are. I think it could actually serve as an interesting example for classification or regression.

Custom Search Engine

Google has a new service called Custom Search Engine as part of their Google Co-op plan. It enables advanced users to create a highly specialized vertical search engine that reflects their own needs and knowledge. There have already been some interesting applications, such as Tech Stuff Search, Product Search and Machine Learning Search.

Sunday, October 15, 2006

Information Geometry

Information Geometry is a very beautiful theory that connects probabilistic/statistical models and differential geometry. I just read a physicist's interesting note on Information Geometry. It should be able to bring new insights into the field of machine learning.

"Beauty is the first test: there is no permanent place in the world for ugly mathematics." -- G.H. Hardy

Monday, August 28, 2006

New Book on Web Mining

Web Data Mining - Exploring Hyperlinks, Contents and Usage Data, written by Bing Liu, is going to be published at the end of 2006.

Sunday, August 27, 2006

Jon Kleinberg got one more prestigious award

The Nevanlinna Prize at ICM 2006 has been awarded to Jon Kleinberg. His work is on the mathematics of information networks.

Thursday, August 10, 2006

The Porter Stemming Algorithm

The 'official' page for distribution of the Porter Stemming Algorithm has been revised and moved.

For Python (and other scripting languages), it is probably more efficient to use a C-implemented module (e.g., nltk.stemmer.porter) than to use the pure-Python code directly.
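To give a feel for what the stemmer does, here is a minimal sketch of just step 1a of the algorithm (plural removal) in plain Python. It is an illustration only, not the official distribution code. The function name is my own, and the full algorithm has several further steps that depend on the "measure" of the stem.

```python
def porter_step_1a(word):
    """Apply only step 1a of the Porter stemmer to a lowercase word.

    Rules, tried in order (the first matching suffix wins):
        SSES -> SS
        IES  -> I
        SS   -> SS
        S    -> (removed)
    """
    if word.endswith("sses"):
        return word[:-2]   # drop "es", keep "ss"
    if word.endswith("ies"):
        return word[:-2]   # drop "es", keep "i"
    if word.endswith("ss"):
        return word        # unchanged
    if word.endswith("s"):
        return word[:-1]   # drop the final "s"
    return word


for w in ["caresses", "ponies", "ties", "caress", "cats"]:
    print(w, "->", porter_step_1a(w))
```

A rule this small costs almost nothing in any language. The speed difference between a C module and pure Python should only start to matter when the full sequence of steps is applied to every token of a large corpus.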

Friday, April 21, 2006

Minorthird

MinorThird is a collection of Java classes for storing text, annotating text, and learning to extract entities and categorize text. It was written primarily by Dr. William W. Cohen. It comes with a collection of publicly available extraction problems in Minorthird format (about 2 MB).

Minorthird differs from existing NLP and learning toolkits in a number of ways:


  • Unlike many NLP packages (e.g., GATE, Alembic) it combines tools for annotating and visualizing text with state-of-the-art learning methods.

  • Unlike many other learning packages, it contains methods to visualize both training data and the performance of classifiers, which facilitates debugging.

  • Unlike other learning packages less tightly integrated with text manipulation tools, it is possible to track and visualize the transformation of text data into machine learning data.

  • Unlike many packages (including WEKA), it is open-source, and available for both commercial and research purposes.

  • Unlike any open-source learning systems I know of, it is architected to support active learning and on-line learning, which should facilitate integration of learning methods into agents.