Elasticsearch in an Hour

John Berryman

The use of search is ubiquitous. As a developer you need search in your technology tool belt. This talk introduces Elasticsearch, a front-running, open source search technology. We’ll create an application, execute a search, and dive into internals so that you’ll know where search is most useful.

PyOhio is a free (thanks sponsors!) annual conference for Python programmers in and around Ohio and the entire Midwest.

21 thoughts on “Elasticsearch in an Hour”

  1. Thanks for creating this tutorial! Elastic recently launched the Elastic Contributor Program where you can earn points for creating tutorials like this, contributing code, and more. I hope to see you on the leaderboard! https://www.elastic.co/community/contributor

  2. hi all, gr8 video! can someone tell me how to use elasticsearch for a search api (drf + elasticsearch)? PS: i can't use any high-level api like elasticsearch-dsl. Many thanks :)

  3. Amazing, loved the way you started with SQL, built a use case, and then went into the details of ES.

    Do you have anything similar for SQL vs. NoSQL DBs?

  4. If you were having a hard time understanding TF-IDF check out: https://moz.com/blog/inverse-document-frequency-and-the-importance-of-uniqueness

    "Document frequency measures commonness, and we prefer to measure rareness. The classic way that this is done is with a formula that looks like this: IDFj = log(n/TFj)

    For each term we are looking at, we take the total number of documents in the document set (n) and divide it by the number of documents containing our term (TFj). This gives us more of a measure of rareness. However, we don't want the resulting calculation to say that the word "mobilegeddon" is 1,000 times more important in distinguishing a document than the word "boat," as that is too big of a scaling factor.

    This is the reason we take the Log Base 10 of the result, to dampen that calculation. For those of you who are not mathematicians, you can loosely think of the Log Base 10 of a number as being a count of the number of zeros – i.e., the Log Base 10 of 1,000,000 is 6, and the log base 10 of 1,000 is 3. So instead of saying that the word "mobilegeddon" is 1,000 times more important, this type of calculation suggests it's three times more important, which is more in line with what makes sense from a search engine perspective."
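
The arithmetic in the quote above can be checked with a few lines of Python. This is only a minimal sketch of the quoted formula, log10(n / number of documents containing the term). The function name and the document counts are made up for illustration, and real search engines typically use smoothed variants rather than this exact form.

```python
import math


def inverse_document_frequency(total_docs: int, docs_with_term: int) -> float:
    """IDF as in the quoted formula: log10(n / documents containing the term)."""
    if not 0 < docs_with_term <= total_docs:
        raise ValueError("docs_with_term must be between 1 and total_docs")
    return math.log10(total_docs / docs_with_term)


# Hypothetical corpus size and document counts, chosen to match the
# powers of ten used in the quote.
n = 1_000_000
raw_ratio = n / 1_000                              # undamped ratio: 1000.0
damped = inverse_document_frequency(n, 1_000)      # about 3
very_rare = inverse_document_frequency(n, 1)       # about 6
everywhere = inverse_document_frequency(n, n)      # 0: the term is in every document

print(raw_ratio, damped, very_rare, everywhere)
```

A term that appears in every document scores 0, so it contributes nothing to telling documents apart, while the logarithm keeps very rare terms from dominating the score.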

  5. Hello John, I have been working on an app that is connected to Firebase. I want to use Elasticsearch to retrieve matching values from my database. Could I take a moment of your time to get your help?

  6. I wonder if the presenter is unaware of full-text search capabilities in RDBMSes, or just ignoring it for effect?

    The initial example (10:18) of building an SQL query with con-/disjunctions is cute, but also irrelevant because IF you've decided to use MySQL/InnoDB for search (not saying it's a good idea), you would surely use the most powerful tool for it, which is to index text as FULLTEXT and use MATCH AGAINST queries, not LIKE-style queries.

    This would give you stemming and relevance scoring, basically fixing the problems of the example as given.

    That said, you should never use MySQL for anything.
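
To make the contrast in the last comment concrete, here is a rough Python sketch that builds both kinds of MySQL query as parameterized strings. The `articles` table, its `id`, `title`, and `body` columns, and both function names are hypothetical and are not taken from the talk. The full-text version also assumes a FULLTEXT index already exists on `(title, body)`. Nothing here connects to a database; the `%s` placeholders follow the style used by common MySQL drivers for Python.

```python
def like_query(terms):
    """LIKE-style search: one wildcard clause per term, OR-ed together."""
    clauses = " OR ".join("body LIKE %s" for _ in terms)
    sql = f"SELECT id, title FROM articles WHERE {clauses}"
    params = [f"%{term}%" for term in terms]
    return sql, params


def fulltext_query(terms):
    """MATCH ... AGAINST search, ordered by MySQL's relevance score.

    Assumes a FULLTEXT index on (title, body), created for example with:
    ALTER TABLE articles ADD FULLTEXT INDEX ft_articles (title, body);
    """
    sql = (
        "SELECT id, title, MATCH(title, body) AGAINST (%s) AS score "
        "FROM articles "
        "WHERE MATCH(title, body) AGAINST (%s) "
        "ORDER BY score DESC"
    )
    text = " ".join(terms)
    # The search text is bound twice: once for the score, once for the filter.
    return sql, [text, text]


like_sql, like_params = like_query(["red", "boat"])
ft_sql, ft_params = fulltext_query(["red", "boat"])
print(like_sql, like_params)
print(ft_sql, ft_params)
```

The LIKE version grows by one clause per term and returns rows in no particular order, while the MATCH AGAINST version keeps a fixed shape and can sort by relevance.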
