pcaboche’s hub
Hello!
I am a software engineer currently living in Singapore.
LinkedIn profile: pierre-caboche
I’m using this page to talk about technology…
Articles are usually available for download in PDF format (generated using LaTeX).
Articles
Solr / ElasticSearch
- Solr and ElasticSearch: index, faceting, and analysing the lyrics of Japanese songs
165 pages, 12th September 2022
A hands-on introduction to Solr and ElasticSearch. The case study uses faceting and aggregations to analyse the vocabulary used in a variety of Japanese songs, and to compare the writing styles of popular bands and songwriters (40+ bands, 7000+ songs).
View LaTeX code | View online | Download PDF
Data Engineering
- Python, AWK, Perl, Julia: Processing large data files
44 pages, 2nd July 2022
"How to process large data files easily?"
This is the question we try to answer by comparing four scripting languages against several criteria (features, syntax, performance, ease of use). For our tests, we take a large data file and sort its records into different "buckets" based on some conditions.
View LaTeX code | View online | Download PDF
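To give a rough idea of the "bucket" task described above, here is a minimal sketch in Python (one of the four languages compared). It is not taken from the article: the `amount` column, the thresholds, and the function names `pick_bucket` and `split_into_buckets` are hypothetical, chosen only for illustration. The file is read one row at a time, so memory use stays flat even for large inputs.

```python
import csv
from collections import Counter
from pathlib import Path


def pick_bucket(row):
    """Hypothetical rule: choose a bucket from the row's 'amount' field."""
    try:
        amount = float(row["amount"])
    except (KeyError, ValueError, TypeError):
        return "invalid"  # missing or non-numeric value
    if amount < 100:
        return "small"
    if amount < 10_000:
        return "medium"
    return "large"


def split_into_buckets(src, out_dir):
    """Stream src row by row and write each row to <out_dir>/<bucket>.csv.

    Returns a Counter with the number of rows written per bucket.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    counts = Counter()
    handles, writers = {}, {}
    try:
        with open(src, newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            for row in reader:
                bucket = pick_bucket(row)
                if bucket not in writers:
                    # Open each output file lazily, the first time it is needed.
                    handle = open(out_dir / f"{bucket}.csv", "w",
                                  newline="", encoding="utf-8")
                    handles[bucket] = handle
                    writers[bucket] = csv.DictWriter(
                        handle,
                        fieldnames=reader.fieldnames,
                        extrasaction="ignore",  # tolerate rows with extra cells
                    )
                    writers[bucket].writeheader()
                writers[bucket].writerow(row)
                counts[bucket] += 1
    finally:
        for handle in handles.values():
            handle.close()
    return counts
```

Calling `split_into_buckets("data.csv", "out")` (both paths are placeholders) would create one CSV file per bucket under `out/` and return the per-bucket row counts.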
LaTeX
- Getting started with LaTeX
81 pages, 15th July 2022
This article guides you through your journey with LaTeX: from installation and rendering your first document to the different components that may be needed for writing a detailed article or even a thesis (e.g. cross-references, quotes, bibliography, index, etc.). All of these topics are compiled in one document and presented in a way that is easy to follow.
The article also highlights some of the strengths of LaTeX, and the workflows that can make you more productive.
View LaTeX code | View online | Download PDF
- LaTeX: rendering texts in Japanese and Chinese
32 pages, 11th July 2022
This article explains how to handle non-Latin writing systems in LaTeX, with a focus on Japanese and Chinese.
It also explains how to add pinyin, furigana, and other ruby characters (i.e. the annotations placed above kanji characters, usually to indicate pronunciation or to remove ambiguity).
The article also focuses on portability. The solutions presented should be both free and compatible with multiple systems: Windows, Linux, etc.
View LaTeX code | View online | Download PDF