This project aims to develop an efficient rule-based extractor of references found in English-language scientific papers. The application takes a PDF file or a directory of PDFs and returns an HTML file containing the list of all reference entries with their respective titles. In addition, the title of each article is searched through the Google Web Service to obtain the URL that identifies the article on the web. If the page at that URL provides a BibTeX entry, as on sites such as CiteSeer and IEEE Xplore, that entry appears in the HTML under the corresponding reference. The application cannot search image-based (scanned) PDF files. The project is released under the GNU General Public License.
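As a rough illustration of what rule-based extraction can look like, the sketch below splits the text of a reference section into entries, guesses each title, and renders an HTML list. It is not the project's actual code: the function names (`split_entries`, `guess_title`, `to_html`) and both rules (entries start with a bracketed number, titles are either quoted or the second period-separated segment) are assumptions made for this example, and the text is assumed to have been extracted from the PDF already.

```python
import html
import re

# Assumed rule: every entry starts with a bracketed number such as "[12]".
ENTRY_START = re.compile(r"\[\d+\]\s*")
# Assumed rule: a quoted span is the title (straight or curly double quotes).
QUOTED_TITLE = re.compile(r'["\u201c](.+?)[,.]?["\u201d]')


def split_entries(references_text):
    """Split the plain text of a reference section into single entries."""
    # PDF text wraps entries across lines, so collapse whitespace first.
    flat = " ".join(references_text.split())
    # The first piece is whatever precedes "[1]" (e.g. the heading); drop it.
    pieces = ENTRY_START.split(flat)[1:]
    return [piece.strip() for piece in pieces if piece.strip()]


def guess_title(entry):
    """Guess the title of one entry using two simple rules."""
    match = QUOTED_TITLE.search(entry)
    if match:
        return match.group(1).strip()
    # Fallback: assume "Authors. Title. Venue, year." and take the second
    # segment. This breaks on author initials such as "J. Smith".
    segments = [s.strip() for s in re.split(r"\.\s+", entry) if s.strip()]
    return segments[1] if len(segments) > 1 else entry


def to_html(entries):
    """Render the entries as an HTML ordered list, title first."""
    items = "\n".join(
        "<li><b>{}</b><br>{}</li>".format(
            html.escape(guess_title(entry)), html.escape(entry)
        )
        for entry in entries
    )
    return "<ol>\n{}\n</ol>".format(items)


if __name__ == "__main__":
    sample = (
        "References\n"
        '[1] A. Author and B. Writer, "A sample title," in Proc. Example\n'
        "Conf., 2008.\n"
        "[2] Smith J, Doe A. Learning to cite. Example Journal, 2007.\n"
    )
    print(to_html(split_entries(sample)))
```

Real reference lists mix many citation styles, so a practical extractor would need a larger set of rules than the two shown here; the sample entries are invented for the demonstration.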

Involved Technologies: Python, Python Frameworks, PyDev, Google API, RegEx.

Released on Google Code at http://code.google.com/p/pdftoref/

Year: 2008.


About admin

Iacopo Masi was born on September 6, 1983 in Florence. He received a laurea degree in computer engineering from the University of Florence in 2009, with a thesis on "Feature-based Localization and Mapping of Wide Areas with a PTZ Camera". He currently works at the Visual Information and Media Lab at the Media Integration and Communication Center, University of Florence. His main research interests are the application of pattern recognition and computer vision, specifically in the field of video surveillance.
