Minoru Osuka: POV of a main language contributor
This year, we’ve launched Meilisearch's very first contributors program: the Meilistars. The aim of a contributors’ program is to gather top community contributors, strengthen our relationship with them, highlight the fantastic work they produce, and maybe even help create connections. We won’t go into much depth on this topic for now, as we’re hoping to share more in a dedicated blog post, so watch this space!
As an initiative to bring the spotlight to our amazing community members, we’ve asked if they would be interested in participating in a series of interviews so that we can get to know them better.
And we’re kick-starting our interviews with Minoru Osuka. You may have already met him on GitHub as Mosuka or on Twitter as @minoru_osuka.
Let’s hear a bit more from him!
Meet Minoru
First things first, we’ve asked him to introduce himself in his own words!
“I'm Minoru Osuka. I work as a software engineer and tech lead for a company that provides a job search engine in Japan. I'm mainly involved in search platform development. My hobby is software development, and I've released some of the software I developed as OSS on GitHub.”
So not only does Minoru work in tech, but he also considers it his hobby! We were very curious to know how he got into the tech field.
Minoru explains that he started in a technical school as a programming instructor, but he wished to increase his practical experience and decided to join a software development company.
“This was the start of my career as a software developer. As I used search engines in my work, I became interested in how they work, so I moved to an Internet portal site and have worked in the search engine field ever since.”
Minoru + Meilisearch: it was meant to be
Hearing that Minoru actually worked with search engines, we couldn’t help but ask when he heard about Meilisearch and how he started using it.
“It was about May 2022 when Meilisearch started supporting the Japanese language. Meilisearch used the Japanese morphological analyzer Lindera as their Japanese tokenizer. I maintain Lindera, so it was a very fortunate event.”
Minoru is very grateful to Kination and Miiton, who respectively created the first pull request to add Japanese language support in Meilisearch and implemented it. It is thanks to their combined work that the Japanese support of Meilisearch has become what it is today.
“I have yet to actually use Meilisearch in my work, but voluntas introduced a case study using Meilisearch for their Japanese document search service, which attracted a lot of attention.”
Open source to the bone
Minoru's frequent references to other members of the Meilisearch community during the interview were too significant to overlook. He expressed great satisfaction at having had the opportunity to connect with fellow community members.
“Ever since Meilisearch adopted Lindera, my Twitter followers have increased. I am happy to have met them. I'm very grateful to Meilisearch.”
It's truly remarkable how deeply involved Minoru is in the open-source community. In addition to contributing to Meilisearch and maintaining Lindera, he has also built his own distributed search server.
“I had been using Elasticsearch and Solr for a long time, but using them was not enough for me, so I decided to build a distributed search server while also learning Rust. It was tough, but I learned a lot.”
Oddly, it was this project that led him to maintain Lindera:
“I started working on Lindera because I developed a distributed search server on my own. [...] My friend, who is developing a full-text search library, also developed a Japanese morphological analyzer, but it was not registered on crates.io. When I contacted him to see if he would register it on crates.io, he gave me a surprising answer: ‘I want you to take over this project.’ I was also interested in morphological analyzers, so I decided to take over the development.”
Minoru also thanked fulmicoton, the developer of kuromoji-rs, the original software that eventually evolved into Lindera, describing it as “a wonderful OSS.”
A vision for the future
Given his extensive contributions, it's no surprise that Minoru knows Meilisearch inside out. Its immediate usability is what Minoru values most about it. In fact, he particularly values a specific feature that contributes to this accessibility.
“Meilisearch's automatic detection of what language the indexed documents are written in is great. This is a very helpful feature for users unfamiliar with search engines.”
During our conversation with Minoru, we couldn't pass up the chance to ask him about any improvements he'd like to see in Meilisearch's near future. Unsurprisingly, his suggestions focused on language support. Specifically, he suggested implementing a mechanism to normalize characters before they are tokenized.
For those unfamiliar with the process, it currently happens the other way around: a text is first tokenized (segmented into words), and then each word is normalized according to the particularities of its language. For a Romance language like French, normalization includes lowercasing and removing diacritical marks such as accents, along with anything else that doesn’t affect the meaning of the text. If you’re interested in the subject, you can join the discussion on GitHub or read more about how we handle language support.
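As a rough illustration of the "tokenize first, then normalize" order described above, here is a minimal Python sketch. It is not Meilisearch's actual implementation: the function names are hypothetical, whitespace splitting stands in for a real tokenizer, and normalization is reduced to lowercasing and stripping accents.

```python
import unicodedata


def tokenize(text):
    # Naive whitespace segmentation; a stand-in for a real tokenizer.
    return text.split()


def normalize(token):
    # Lowercase, decompose accented characters (NFD), then drop the
    # combining marks so that "é" becomes "e".
    decomposed = unicodedata.normalize("NFD", token.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def process(text):
    # Step 1: tokenize. Step 2: normalize each token individually.
    return [normalize(token) for token in tokenize(text)]


print(process("Le Café est à Orléans"))
# ['le', 'cafe', 'est', 'a', 'orleans']
```

With this order, the tokenizer always sees the raw text. Minoru's suggestion amounts to running a normalization step on the text before `tokenize` is called.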
Minoru's suggestions for language support included customizing the normalizer for each field. For example, given a document with an address field, he would like to be able to instruct Meilisearch to convert the Kanji numerals in that field into Arabic numerals. In his words:
"Right now, Meilisearch does not have a normalizer for Japanese, but it would be nice to be able to customize it for each field […] I would also like to contribute to Japanese normalizers."
We look forward to working on improving our language support with Minoru, and with any language aficionados who want to support our efforts!
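To make the per-field idea more concrete, here is a minimal Python sketch of a normalizer that runs before tokenization and rewrites Kanji numerals only in an address field. Everything in it is an assumption made for illustration: Meilisearch does not expose a `FIELD_NORMALIZERS` setting, and the rule used here (only positional digits directly followed by the counters 丁目, 番, or 号 are converted) is deliberately narrow so that place names such as 六本木 are left alone. Compound numerals such as 十二 are not handled and are left untouched.

```python
import re

# Positional Kanji digits and their Arabic equivalents.
KANJI_DIGITS = str.maketrans("〇一二三四五六七八九", "0123456789")

# Match a run of Kanji digits only when it is followed by an address
# counter and is not the tail of a longer numeral (e.g. the 二 in 十二).
ADDRESS_NUMBER = re.compile(
    r"(?<![十百千〇一二三四五六七八九])[〇一二三四五六七八九]+(?=丁目|番|号)"
)


def kanji_digits_to_arabic(text):
    return ADDRESS_NUMBER.sub(
        lambda match: match.group(0).translate(KANJI_DIGITS), text
    )


# Hypothetical per-field configuration: field name -> normalizer.
FIELD_NORMALIZERS = {"address": kanji_digits_to_arabic}


def normalize_document(document):
    # Apply the field's normalizer if one is configured; otherwise
    # keep the value unchanged.
    return {
        field: FIELD_NORMALIZERS.get(field, lambda value: value)(value)
        for field, value in document.items()
    }


doc = {"name": "三田商店", "address": "東京都港区六本木三丁目二番一号"}
print(normalize_document(doc))
# {'name': '三田商店', 'address': '東京都港区六本木3丁目2番1号'}
```

The `name` field keeps its Kanji numeral because no normalizer is configured for it, which is the point of scoping normalization to individual fields.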
It was a real pleasure talking with Minoru, getting to know him better, and hearing his insights on Meilisearch, how it is being used, and the people he has met through it.
As a reminder, you can find Minoru on GitHub or contribute to Lindera.
We hope you found this interview as interesting as we did and look forward to meeting all of our incredible Meilistars.