RONDHUIT Service PDF Brochure

Next Generation Search (NLP & ML)

RONDHUIT researches and develops natural language processing (NLP) and machine learning (ML) technologies and delivers the results to customers through our consulting service, so that they can use and manage search functions “more conveniently”, “more intelligently”, and “more easily”. You can combine the knowledge gained from our services with your existing systems, including Lucene/Solr, to develop them into a next-generation search system. The following are some related materials.

livedoor News Corpus


This corpus was built from news stories on “livedoor news”, operated by NHN Japan. Only the articles covered by the Creative Commons license described below were collected, and as many HTML tags as possible were removed.

Collection Timing: Downloaded in early September, 2012.
Download (plain text): ldcc-20140209.tar.gz
Download (for Apache Solr): livedoor-news-data.tar.gz
Use this URL when you cite the corpus in a paper.


The Creative Commons license (Attribution-NoDerivs) applies to each article file. Credit requirements differ by news category, so refer to the LICENSE.txt in the corresponding subdirectory of the extracted download. livedoor is a registered trademark of NHN Japan.
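As a minimal sketch of how the per-category license files might be collected after downloading the plain-text archive: the function below scans a .tar.gz file for members named LICENSE.txt and keys each one by its parent directory. The function name find_license_files is hypothetical, and the assumption that each category sits in its own directory containing a LICENSE.txt is taken only from the description above, not from an inspection of the archive.

```python
import tarfile
from pathlib import PurePosixPath


def find_license_files(archive_path):
    """Map each directory that holds a LICENSE.txt to that file's text.

    Assumption: every news category lives in its own subdirectory of the
    archive and carries its own LICENSE.txt, as the page describes.
    """
    licenses = {}
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            path = PurePosixPath(member.name)
            if member.isfile() and path.name == "LICENSE.txt":
                # The parent directory name is treated as the category name.
                data = tar.extractfile(member).read()
                licenses[path.parent.name] = data.decode("utf-8", errors="replace")
    return licenses
```

Calling find_license_files("ldcc-20140209.tar.gz") on the downloaded file should then give one entry per category directory, which can be checked before crediting articles from that category.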


RONDHUIT would like to express its sincere gratitude to NHN Japan for releasing part of “livedoor news” under the Creative Commons license.

White Papers


Seminar Materials



Lucene/Solr Workshop 16@Recruit Technologies (2015)

Solr Workshop 15@Recruit Technologies (2014)

Solr Workshop 14@Recruit Technologies (2014)

Solr Workshop 12@VOYAGE GROUP (2013)

  • ManifoldCF and Solr

Solr Workshop 10@VOYAGE GROUP (2013)

IBM Power Systems Solution Seminar 8@IBM Japan (2012)

Solr Workshop 8@VOYAGE GROUP (2012)

Solr Workshop 6@EC Navi (2011)

Solr Workshop 5@EC Navi (2011)

Building Search System with Open Source@SCSK (2010)

Solr Workshop 3@EC Navi (2010)

Next Generation Search Technology Forum 2010

RONDHUIT gave a speech at Next Generation Search Technology Forum 2010

Free Seminar (2008)

LinuxWorld Expo/Tokyo 2008

Takeshi Nakano, Recruit
Koji Sekiguchi, RONDHUIT

Recruit and RONDHUIT gave a joint talk at LinuxWorld Expo/Tokyo 2008 (Tokyo Big Sight)


Sample Code from Books/Magazine Articles

Links to Articles within the Site

Lucene Revolution 2014 @Washington, D.C. Business Trip Report

ApacheCon 2014 NA @Denver Business Trip Report

Lucene Revolution 2013 @San Diego Business Trip Report

Lucene Revolution 2012 @Boston Business Trip Report

Apache Lucene Eurocon 2011 @Barcelona Business Trip Report

Security Warnings

Apache ManifoldCF Related Articles

Named Entity Extraction Server – NExTR on Rails (GPL2)

NExTR on Rails is a partially modified open-source version of NExTR, a Ruby port of the NExT named entity extraction tool developed at Mie University, adapted so that it can be used from Rails. It is licensed under GPL2 and distributed as an appliance for VMware virtual environments. Download and extract the file, then run it in VMware. Log in with the user name “next” and the password “chasen”.

Related Links