Project

rbbt-text

0.01
A long-lived project that still receives updates
Text mining tools: named entity recognition and normalization, document classification, bag-of-words, dictionaries, etc
 Dependencies

Runtime

>= 0
>= 4.0.0
 Project Readme

Install

1 - Install rvm: rvm.beginrescueend.com/ (requires git and curl to be installed)

:script

curl -L https://get.rvm.io | bash -s stable --ruby

cd
cat .bash_profile > tmp.bash_profile
echo '[[ -s "$HOME/.rvm/scripts/rvm" ]] && . "$HOME/.rvm/scripts/rvm"' >> tmp.bash_profile
cp .bash_profile .bash_profile.save
mv tmp.bash_profile .bash_profile
. .bash_profile
rvm install 1.9.3
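
Step 1 assumes that git and curl are already on the PATH. A minimal pre-flight check could look like the sketch below; `missing_tools` is a hypothetical helper written for this example and is not part of rvm or rbbt.

```shell
# Hypothetical helper: print each named tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# git and curl are the prerequisites named in step 1.
missing=$(missing_tools git curl)
if [ -n "$missing" ]; then
  echo "Install these before running the rvm installer:" $missing
else
  echo "git and curl found"
fi
```

Running this before the curl command above avoids a half-finished rvm install when one of the tools is absent.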

2 - Install tokyocabinet (instructions for a user-level install follow; for a system-level install, make sure the dev package is installed)

1 - download source from http://fallabs.com/tokyocabinet/tokyocabinet-1.4.47.tar.gz
2 - unpack and compile using a user-level prefix
3 - set LD_RUN_PATH and LD_LIBRARY_PATH to point there

:script

cd
mkdir -p tmp/tokyocabinet   
cd tmp/tokyocabinet
wget "http://fallabs.com/tokyocabinet/tokyocabinet-1.4.47.tar.gz"
tar xvfz tokyocabinet-1.4.47.tar.gz
cd tokyocabinet-1.4.47
./configure --prefix="$HOME/software/opt/tokyocabinet"
make
make install
cd
cat .bash_profile > tmp.bash_profile
# Single quotes keep the variables unexpanded, so they are resolved at login time
echo 'export LD_RUN_PATH="$LD_RUN_PATH:$HOME/software/opt/tokyocabinet/lib"' >> tmp.bash_profile
echo 'export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$HOME/software/opt/tokyocabinet/lib"' >> tmp.bash_profile
echo 'export PATH="$PATH:$HOME/software/opt/tokyocabinet/bin"' >> tmp.bash_profile
cp .bash_profile .bash_profile.save2
mv tmp.bash_profile .bash_profile
. .bash_profile
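
After sourcing the profile, it may be worth confirming that the tokyocabinet library directory really ended up in the search paths. The following is only a sketch: `path_contains` is a hypothetical helper, and the prefix is the one used in the script above.

```shell
# Hypothetical helper: succeed if the colon-separated list $1 contains entry $2.
path_contains() {
  case ":$1:" in
    *":$2:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Library directory created by the user-level install above.
tc_lib="$HOME/software/opt/tokyocabinet/lib"

if path_contains "${LD_RUN_PATH:-}" "$tc_lib"; then
  echo "LD_RUN_PATH ok"
else
  echo "LD_RUN_PATH does not include $tc_lib"
fi

if path_contains "${LD_LIBRARY_PATH:-}" "$tc_lib"; then
  echo "LD_LIBRARY_PATH ok"
else
  echo "LD_LIBRARY_PATH does not include $tc_lib"
fi
```

Wrapping both the list and the entry in colons makes the match exact, so `/opt/lib` is not mistaken for `/opt/lib64`.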

3.pre1 - If using Ruby 1.9, install these gems from GitHub; they fix some issues with 1.9

:script

gem install specific_install hoe
gem specific_install -l https://github.com/bensomers/png.git
gem specific_install -l https://github.com/mikisvaz/tokyocabinet_19_fix.git

3.pre2 - A couple of gems are better installed beforehand, since they require some configuration

:script

# RSRuby
## Example on Ubuntu
gem install rsruby -- --with-R-dir=/usr/lib/R/lib/ --with-R-include=/usr/share/R/include

## Example on Mac
gem install rsruby -- --with-R-dir=/Library/Frameworks/R.framework/Resources/

# RJB
export JAVA_HOME="full_path_to_jdk"
gem install rjb
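
The rjb step expects `JAVA_HOME` to hold the full path to a JDK. One rough way to check this before installing, assuming the common JDK layout with an executable `bin/javac` (the helper name `looks_like_jdk` is made up for this example):

```shell
# Hypothetical helper: succeed if $1 is non-empty and contains an executable bin/javac.
looks_like_jdk() {
  [ -n "$1" ] && [ -x "$1/bin/javac" ]
}

if looks_like_jdk "${JAVA_HOME:-}"; then
  echo "JAVA_HOME looks like a JDK: $JAVA_HOME"
else
  echo "JAVA_HOME is unset or has no bin/javac; set it before 'gem install rjb'"
fi
```

This is only a heuristic. A directory can pass the check and still lack the headers a native build needs.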

3 - Install gems rbbt-util rbbt-sources rbbt-text rbbt-phgx, …

:script

gem install rbbt-util rbbt-sources rbbt-text rbbt-phgx rbbt-entities rbbt-views rbbt-dm rbbt-GE

3.bis - Or clone the GitHub repos and make Ruby use them

:script

mkdir -p ~/git
cd ~/git
git clone git@github.com:mikisvaz/rbbt-util.git
git clone git@github.com:mikisvaz/rbbt-sources.git
git clone git@github.com:mikisvaz/rbbt-text.git
git clone git@github.com:mikisvaz/rbbt-phgx.git
git clone git@github.com:mikisvaz/rbbt-entities.git
git clone git@github.com:mikisvaz/rbbt-dm.git
git clone git@github.com:mikisvaz/rbbt-rest.git
git clone git@github.com:mikisvaz/rbbt-studies.git
alias druby="env RBBT_LOG=0 ruby $(for d in $HOME/git/rbbt-*;do echo -n "-I$d/lib ";done)"
alias drake="env RBBT_LOG=0 rake $(for d in $HOME/git/rbbt-*;do echo -n "-I$d/lib ";done)"
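
Because the alias bodies are double-quoted, the `$(for ...)` loop runs once, when the alias is defined, so the `-I` flags reflect whichever `rbbt-*` directories exist under `$HOME/git` at that moment. The sketch below reproduces the same expansion as a function so the flags can be previewed. `rbbt_include_flags` is a hypothetical name, and unlike the aliases it skips directories that have no `lib` subdirectory.

```shell
# Hypothetical helper: print "-I<dir>/lib " for every rbbt-* directory under $1
# that actually has a lib subdirectory.
rbbt_include_flags() {
  for d in "$1"/rbbt-*; do
    if [ -d "$d/lib" ]; then
      printf '%s' "-I$d/lib "
    fi
  done
}

# Preview the flags that the druby and drake aliases would embed.
echo "ruby flags: $(rbbt_include_flags "$HOME/git")"
```

If more repositories are cloned later, the aliases need to be defined again to pick them up.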

4 - Install other gems that you might need for some of my workflows

:script

gem install redcarpet thin

5 - Set up R to find the helper lib

:script

echo "source(system('rbbt_Rutil.rb', intern =T));" >> $HOME/.Rprofile
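
Since `>>` appends unconditionally, running step 5 twice leaves two identical lines in `.Rprofile`. A possible guard is sketched below; `append_once` is a hypothetical helper, not something rbbt provides.

```shell
# Hypothetical helper: append line $1 to file $2 unless an identical line is already there.
append_once() {
  touch "$2"
  # -x matches whole lines, -F treats the text literally, -q suppresses output.
  grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Usage (same line as in step 5):
#   append_once "source(system('rbbt_Rutil.rb', intern =T));" "$HOME/.Rprofile"
```

The same helper could be applied to the `.bash_profile` edits in the earlier steps.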

6 - If you plan to run workflows, you might want to change the default directory where they are stored (~/.workflows)

:script

mkdir -p ~/.rbbt/etc
mkdir -p ~/git/workflows
echo "~/git/workflows/" > ~/.rbbt/etc/workflow_dir
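
Note that the quoted `echo` above writes a literal `~` into the file. To see what was stored and which directory it points to, something like the following sketch may help. `configured_workflow_dir` is a hypothetical helper, and expanding the leading `~/` to `$HOME` is an assumption made for this example, not a statement about how rbbt reads the file.

```shell
# Hypothetical helper: print the directory stored in config file $1,
# with a leading "~/" replaced by $HOME (assumed interpretation).
configured_workflow_dir() {
  dir=$(cat "$1") || return 1
  case "$dir" in
    "~/"*)
      rest=${dir#"~/"}
      dir="$HOME/$rest"
      ;;
  esac
  printf '%s\n' "$dir"
}

# Report the configured directory if the file from step 6 exists.
if [ -f "$HOME/.rbbt/etc/workflow_dir" ]; then
  configured_workflow_dir "$HOME/.rbbt/etc/workflow_dir"
fi
```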