新模型学到头秃?gobbli统一模型库帮你快速上手文本分类,内置BERT、fastText等
机器之心报道
参与:一鸣、张倩
模型太多往往也是个问题,特别是开发者需要逐个学习每个模型的使用方法。最近,RTI International 公司的数据科学家们开发了一个统一的语言模型库 gobbli。用户可以像使用 Keras 那样直接上手文本分类任务,还有很多知名的语言模型可以选择,如 BERT 等。

项目地址:https://github.com/RTIInternational/gobbli
使用文档:https://gobbli.readthedocs.io/en/latest/quickstart.html
from gobbli.experiment import ClassificationExperiment
from gobbli.model import MajorityClassifier
X = ["This is positive.","This is negative.","This is bad.","This is good.","This is really bad.","This is really good.","This is pretty good.","This is pretty bad.",]
y = ["Good","Bad","Bad","Good","Bad","Good","Good","Bad",]
exp = ClassificationExperiment(model_cls=MajorityClassifier,dataset=(X, y))
results = exp.run()
from gobbli.io import TrainInput
train_input = TrainInput(
# X_train: A list of strings to classify
X_train=["This is a training document.","This is another training document."],
# y_train: The true class for each string in X_train
y_train=["0", "1"],
# And likewise for validation
X_valid=["This is a validation sentence.","This is another validation sentence."],
y_valid=["1","0"],
# Number of documents to train on at once
train_batch_size=1,
# Number of documents to evaluate at once
valid_batch_size=1,
# Number of times to iterate over the training set
num_train_epochs=1)
from gobbli.model import MajorityClassifier
clf = MajorityClassifier()# Set up classifier resources -- Docker image, etc.
clf.build()
train_output = clf.train*(train_input)
from gobbli.io import PredictInput
predict_input = PredictInput(
# X: A list of strings to predict the trained classes for
X=["Which class is this document?"],
# Pass the set of labels and the trained checkpoint
# from the training output
labels=train_output.labels,
checkpoint=train_output.checkpoint,
# Number of documents to predict at once
predict_batch_size=1)
predict_output = clf.predict(predict_input)

pip install gobbli
https://github.com/RTIInternational/gobbli
关注公众号:拾黑(shiheibook)了解更多
[广告]赞助链接:
四季很好,只要有你,文娱排行榜:https://www.yaopaiming.com/
让资讯触达的更精准有趣:https://www.0xu.cn/
关注网络尖刀微信公众号随时掌握互联网精彩
赞助链接
排名
热点
搜索指数
- 1 习近平同马克龙交流互动的经典瞬间 7903986
- 2 黑龙江水库冰面下现13匹冰冻马 7808885
- 3 微信表情包戒烟再度翻红 7712101
- 4 2025你的消费习惯“更新”了吗 7618291
- 5 三星堆与秦始皇帝陵竟有联系 7521656
- 6 为啥今年流感如此厉害 7423607
- 7 劲酒如何成了年轻女性的神仙水 7333465
- 8 首次!台湾浅滩海域搜救应急演练举行 7233922
- 9 一身塑料过冬?聚酯纤维真是塑料瓶吗 7140848
- 10 中疾控流感防治七问七答 7041605







机器之心
