 * [luigi](https://github.com/spotify/luigi) - A module that helps you build complex pipelines of batch jobs.
 * [mrjob](https://github.com/Yelp/mrjob) - Run MapReduce jobs on Hadoop or Amazon Web Services.
-* [PySpark](http://spark.apache.org/docs/latest/programming-guide.html) - The Spark Python API.
 * [streamparse](https://github.com/Parsely/streamparse) - Run Python code against real-time streams of data. Integrates with [Apache Storm](http://storm.apache.org/).
 
 ## Microsoft Windows
@@ -788,14 +785,14 @@ Inspired by [awesome-php](https://github.com/ziadoz/awesome-php).
 
 *Libraries for working with human languages.*
 
+* [gensim](https://github.com/RaRe-Technologies/gensim) - Topic Modelling for Humans.
 * [Jieba](https://github.com/fxsjy/jieba) - Chinese text segmentation.
 * [langid.py](https://github.com/saffsd/langid.py) - Stand-alone language identification system.
 * [NLTK](http://www.nltk.org/) - A leading platform for building Python programs to work with human language data.
 * [Pattern](http://www.clips.ua.ac.be/pattern) - A web mining module for the Python.
 * [SnowNLP](https://github.com/isnowfy/snownlp) - A library for processing Chinese text.
 * [spaCy](https://spacy.io/) - A library for industrial-strength natural language processing in Python and Cython.
 * [TextBlob](https://github.com/sloria/TextBlob) - Providing a consistent API for diving into common NLP tasks.
-* [TextGrocery](https://github.com/2shou/TextGrocery) - A simple, efficient short-text classification tool based on LibLinear and Jieba.