RaSC (Rapid Service Connector)

Overview

RaSC is a free and open source middleware, developed by the Information Analysis Laboratory at the National Institute of Information and Communications Technology (NICT). RaSC facilitates high-speed and highly parallelized execution of user programs.

RaSC was developed to apply user programs such as morphological analyzers and dependency parsers to a huge number of Web pages. To this end, RaSC can run user programs and connect them across distributed computation nodes. A typical use of RaSC is to process multiple inputs from a file or a stream in parallel, using multi-core CPUs and/or multiple computation nodes. Although RaSC was originally designed for natural language processing (NLP), it works with many kinds of user programs, not only NLP programs. As long as a program reads its input from standard input or a file and writes its result to standard output or a file, it can in most cases be executed in a distributed manner on RaSC with only slight changes.

Process instances of user programs running on RaSC stay resident in memory once they have started. For this reason, even programs that take a long time to start (for example, NLP programs that load a large dictionary file) can run efficiently. In addition, RaSC makes it easy to use user programs on remote computers over a network. When many inputs are given, they can be distributed to multiple computers. User programs can be easily connected through stream communication, like a UNIX pipe, and they are executed in parallel without the user having to be aware of it (refer to Overview for more details).
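To make the pipe analogy concrete, here is a minimal Python sketch of two programs connected through a stream, the way a shell runs `a | b`. It does not use RaSC or its API. The two stage programs and the function name `run_pipeline` are hypothetical stand-ins for real user programs, such as a morphological analyzer followed by a parser.

```python
import subprocess
import sys

# Hypothetical stand-ins for two user programs (not part of RaSC).
# Stage 1 upper-cases each line; stage 2 prefixes each line with "> ".
STAGE1 = "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line.upper())\n"
STAGE2 = "import sys\nfor line in sys.stdin:\n    sys.stdout.write('> ' + line)\n"


def run_pipeline(text):
    """Feed text through stage 1 and stage 2, connected like `stage1 | stage2`."""
    first = subprocess.Popen(
        [sys.executable, "-c", STAGE1],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
    )
    second = subprocess.Popen(
        [sys.executable, "-c", STAGE2],
        stdin=first.stdout, stdout=subprocess.PIPE,
    )
    # The parent no longer needs its copy of the connecting pipe.
    first.stdout.close()
    # Small inputs only: a large input would need a writer thread to avoid blocking.
    first.stdin.write(text.encode("utf-8"))
    first.stdin.close()
    output = second.stdout.read().decode("utf-8")
    second.stdout.close()
    first.wait()
    second.wait()
    return output


if __name__ == "__main__":
    print(run_pipeline("hello\nworld\n"), end="")
```

The second stage starts consuming lines as soon as the first stage emits them, which is the property the paragraph above refers to when it compares RaSC's stream communication to a UNIX pipe.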

The following shows an example of executing the Dependency and Case Structure Analyzer KNP on RaSC. In this example, the input consists of 500 lines, each containing one sentence. The lines are assigned to multiple process instances and processed in parallel on multi-core CPUs (8 parallel instances on two Intel Xeon X5675 processors), which makes processing about 5 times faster (about 29 seconds instead of 2 minutes 28 seconds). The order of the inputs in the original input file (INPUT_TXT) is preserved in the output file (OUTPUT_TXT). (This example can be implemented by the program described in How to parallelize processing on inputs from a pipe.)

$ time cat INPUT_TXT | juman | knp > OUTPUT_TXT  # Directly run a user program without RaSC
real    2m28.456s   # Without parallelization
user    2m17.557s
sys     0m1.011s
$ ./server.sh KNPService 19999 start # Start a RaSC service that runs KNP
$ time cat INPUT_TXT | java -cp ./lib/*: RaSCClient localhost 19999 > OUTPUT_TXT   # Other computer nodes can be accessed by changing the host and port.
real    0m29.402s   # Parallelization with RaSC (8 parallel processes on two Intel Xeon X5675)
user    0m0.566s
sys     0m0.045s
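The idea behind this speedup (a fixed set of resident processes, inputs distributed among them, and outputs returned in input order) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not RaSC's implementation or API: the class `ResidentWorker`, the function `run_parallel`, and the child program that upper-cases each line are all hypothetical, and the child is assumed to answer exactly one output line per input line.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from queue import Queue

# Hypothetical stand-in for a user program: one output line per input line.
CHILD_CODE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


class ResidentWorker:
    """One long-lived child process that is started once and reused."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-u", "-c", CHILD_CODE],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            text=True, bufsize=1,
        )

    def process(self, line):
        # Send one line and wait for the single line the child answers with.
        self.proc.stdin.write(line + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()


def run_parallel(lines, num_workers=4):
    """Process lines on num_workers resident processes, keeping input order."""
    workers = [ResidentWorker() for _ in range(num_workers)]
    idle = Queue()
    for worker in workers:
        idle.put(worker)

    def handle(line):
        worker = idle.get()      # borrow an idle worker
        try:
            return worker.process(line)
        finally:
            idle.put(worker)     # hand it back for the next input

    try:
        with ThreadPoolExecutor(max_workers=num_workers) as executor:
            # Executor.map yields results in the order of the inputs.
            return list(executor.map(handle, lines))
    finally:
        for worker in workers:
            worker.close()


if __name__ == "__main__":
    for result in run_parallel(["first sentence", "second sentence"], num_workers=2):
        print(result)
```

Because each child is started only once, its startup cost (for example, loading a large dictionary) is paid once per worker rather than once per input line.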

RaSC is used in the large-scale Web information analysis system WISDOM X and the Disaster Information Analysis System DISAANA, both developed by NICT. Using RaSC, WISDOM X can apply various analyses, such as dependency analysis, sentiment analysis, and causality extraction, to up to 100 million Web documents per day.

RaSC is distributed under LGPL v2.1.

User programs tested on RaSC

The following table lists the user programs that have been tested on RaSC. You can download the service definition XML files required to run these programs on RaSC. A service definition XML is a configuration file that specifies the command line for a user program, the number of process instances for parallel execution, and other settings. For details, refer to Run a user program as a RaSC service (when being used only with MessagePack RPC) and Work with various network protocols (when using JSON RPC, ProtocolBuffers, and SOAP).

Note

Paths to user programs and model files must be set according to your environment in the service definition XML.

| User program | Service definition XML | Remarks |
|---|---|---|
| Morphological analyzer MeCab | Service definition XML, Service definition XML (with 8 parallel processes) | |
| Morphological analyzer Juman | Service definition XML, Service definition XML (with 8 parallel processes) | |
| Dependency parser J.DepP | Service definition XML, Service definition XML (with 8 parallel processes) | A shell script is required to connect with MeCab through a pipe (refer to How to connect multiple user programs through a pipe) |
| Dependency parser KNP | Service definition XML, Service definition XML (with 8 parallel processes) | A shell script is required to connect with Juman through a pipe (refer to How to connect multiple user programs through a pipe) |
| Dependency parser Enju | Service definition XML, Service definition XML (with 8 parallel processes) | |
| GENIA tagger | Service definition XML, Service definition XML (with 8 parallel processes) | |
| Speech recognition engine Julius | - | Refer to the article by Yuki Igarashi at Tohoku University (in Japanese). |
| SVM Perf | Service definition XML | Apply a patch to SVM Perf (refer to Use SVM Perf) |
| CRF++ | Service definition XML | Apply a patch to CRF++ (refer to Use CRF++) |
| TinySVM | Service definition XML | Apply a patch to TinySVM (refer to Use TinySVM) |

Contact

  • rasc-user [at] ml.nict.go.jp: mailing list for questions and requests. Messages posted on this list are shared with all participating users.
    • To join the list, send an email containing the line #subscribe YOUR NAME to rasc-user-ctl [at] ml.nict.go.jp, and follow the procedure described in the reply (note: YOUR NAME is not an email address and must be written in ASCII characters).
    • Only participating users can post a message.
  • rasc-contact [at] ml.nict.go.jp: mailing list for contacting administrators. Messages posted on this list are sent to administrators only.

(Please replace [at] with @.)