Startup Village

Online Booth - Sōseki


Try out AI "Sōseki" at SXSW

Sōseki is a state-of-the-art natural language AI. To show how far natural language processing technology has come and to demonstrate Sōseki’s abilities, we are showcasing two demos at SXSW.

Demo 1: Trivia Master

Sōseki can search through large databases. We have connected Sōseki to Wikipedia so that users can search it, and turned the result into a trivia master.
We challenge everyone to ask Sōseki their favorite trivia question! Let’s find out if you can outsmart Sōseki.

>> Try out our trivia master  

Demo 2: SXSW Session Navigator

Sōseki can understand documents on a wide range of topics. We have built a database of SXSW session information. Users can tweet their interests to @AiSoseki, and Sōseki will tweet back the session that best fits the topic.
Let Sōseki find the session at SXSW that matches your interest!

>> Tweet at @AiSoseki (e.g., @AiSoseki How to fight the pandemic?)

*Speakers at SXSW are welcome to register additional information about their sessions to improve the search suggestions. Please go to https://sxsw2021soseki.squarespace.com for more information.

What is Sōseki

Sōseki is a state-of-the-art natural language AI that understands your private documents, and provides answers to questions based on the knowledge accumulated in the text.

The natural language processing technology powering Sōseki is proven to be top class. It has outperformed humans as well as all other AI models, obtaining state-of-the-art results on five well-known datasets:

  • Open Entity (entity typing)        
  • TACRED (relation classification)        
  • CoNLL-2003 (named entity recognition)        
  • ReCoRD (cloze-style question answering)        
  • SQuAD 1.1 (extractive question answering).

It also ranked second, behind only Facebook, in the “Systems under 6GB” track of the Efficient Open-Domain Question Answering Competition held at NeurIPS 2020.

Sōseki’s high performance comes from its ability to accurately understand the meaning of text. We have developed a deep learning model that effectively captures entity-based encyclopedic knowledge, such as that available in Wikipedia, in addition to the linguistic knowledge captured by existing models.


Behind the name Sōseki

Natsume Sōseki was a Japanese novelist. Born in 1867, he wrote classics such as “Kokoro”, “Botchan” and “I Am a Cat” and is renowned in Japan and around the world. His portrait appeared on the 1,000 yen bill issued from 1984 to 2004.

We named our model out of respect for his deep insight into natural language.

In developing our new AI model, we wanted it to be recognized for its deep understanding of text and to be used around the world. Natsume Sōseki embodies these qualities, so we chose to name our model after him.


We are Studio Ousia

Sōseki is developed by Studio Ousia, an AI startup in Japan. Since our founding in 2007, we have continued to develop cutting-edge natural language processing (NLP) technology.

We have ranked in various NLP-related academic competitions and benchmarks. We contribute to the AI community by publishing our technology in papers and making it available as open source.

We will continue pushing the envelope of NLP and helping corporations accelerate their business. As a startup based in Japan, we hope to contribute by bringing state-of-the-art technology to Asian languages and businesses.


Product Category

Social media

Contact details