Friday, November 16, 2018

Data Storage Systems

My presentation from from PiterPy Meetup #10 Hardcore about the data structures used in databases for storing and retrieving data.
Two approaches to data processing are considered: OLTP and OLAP.
SQL, NoSQL and New SQL databases are discussed.
The tradeoffs that the developers face when creating storage systems are shown.
Also the methods of data storage and interaction with the database provides CPython are considered.
The presentation and the list of references and books helps more easily navigate the data storage engines and understand which tool is better suited for a particular task.

Tuesday, October 16, 2018

SPARROW theorem and RUM Conjecture

I've found a post in RocksDB blog about the SPAce, Read Or Write (SPARROW) theorem, which states that:
1. Read Amplification (RA) is inversely related to Write Amplification (WA)
2. Write Amplification (WA) is inversely related to Space Amplification (SA)

Seems like the same, but more detailed principles are described in RUM Conjecture paper.

Sunday, October 14, 2018

Периодическая таблица структур данных

Группа исследователей из Гарварда под руководством профессора Стратоса Идреоса опубликовала интересную работу "The Periodic Table of Data Structures" [1] (Периодическая таблица структур данных или даже Периодическая система структур данных), в которой они пытаются систематизировать существующие структуры данных путём их декомпозиции на части – основные принципы, то есть минимальные структуры данных (например связный список).
Полученная в итоге модель позволяет увидеть какие структуры данных ещё остаются малоизученными или требуют оптимизации, а также получить лучшие результаты производительности путём комбинирования и настройки существующих структур данных.

По мнению авторов, периодическая таблица структур данных, поможет ускорить исследования и принятие решений при проектировании и разработке новых структур данных.

В работе также предлагается понятие design continuums (континуумы дизайна или континуумы проектирования) – часть пространства проектирования структур данных, которое можно рассматривать как единое, потому что оно определяется одним и тем же набором основных принципов. Структурируя пространство проектирования, можно заметить такие континуумы в классах структур данных, которые традиционно считались принципиально разными.

Данная работа тесно связана с другой работой под названием "Data Calculator" [2, 3] (Калькулятор данных), в которой исследователи пытаются разработать интерактивный, полуавтоматический подход для генерации и проектирования структур данных.

Ранее они же предложили гипотезу RUM про которую я уже писал в блоге.

[1] S. Idreos, et al., “The Periodic Table of Data Structures,” Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, vol. 41, no. 3, pp. 64-75, 2018.
[2] S. Idreos, K. Zoumpatianos, B. Hentschel, M. Kester, and D. Guo. The Data Calculator: Data Structure Design and Cost Synthesis from First Principles and Learned Cost Models. SIGMOD, 2018.
[3] S. Idreos, K. Zoumpatianos, B. Hentschel, M. Kester, and D. Guo. The Internals of the Data Calculator, 2018.

Tuesday, October 9, 2018

Epoch protection

Epoch protection is a technique to avoid expensive synchronization between threads.

[1] Keir Fraser: Practical lock-freedom
[2] Faster: A Concurrent Key-Value Store with In-Place Updates

Monday, September 17, 2018

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

Finally I've finished reading this awesome book by Martin Kleppmann. It is one of the best technical books I have read. A Lot of useful information with great explanations and schemes.
Special thanks for references after each chapter and of course beautiful maps between chapters :)
This book sparked my interest in databases even more. Thank you Martin for such a good reading! Can't stop to recommend it to everyone!

Thursday, September 13, 2018

LSM tree, Memtable and SSTable

LSM tree (Log-structured merge-tree) is a data structure with performance characteristics that make it attractive for providing indexed access to files with high insert volume, such as transactional log data. LSM trees, like other search trees, maintain key-value pairs. LSM trees maintain data in two or more separate structures, each of which is optimized for its respective underlying storage medium; data is synchronized between the two structures efficiently, in batches [1].
LSM trees are used in data stores such as Bigtable, HBase, LevelDB, RocksDB, ScyllaDB, Apache Cassandra, MongoDB, SQLite4, Tarantool, WiredTiger, InfluxDB, etc.

Memtable is in-memory data-structure  that holds the data before it flushed to the SSTables.
In LevelDB, RocksDB and Cassandra the implementation of Memtable is based on skiplist [2] data-structure [3, 4, 5].

SSTable (Sorted Strings Table) is a key-value based persistent, ordered immutable storage [6].
The SSTable contains a sequence of blocks (typically 64 KB in size). At the end of the SSTable an index block is stored.

[1] Log-structured merge-tree
[2] Skip list
[3] Reviewing LevelDB: Part V, into the MemTables we go
[4] MemTable
[5] Apache Cassandra Github: SASIIndex
[6] Bigtable: A Distributed Storage System for Structured Data

Monday, August 20, 2018

PEPFS - Writing file system in CPython

Writing file system in CPython

On last PiterPy meetup I've made a talk about writing simple file system PEPFS in CPython using FUSE (File System In Userspace).

The PEPFS project is available on github:
PEPFS is a simple read-only file system where files are CPython PEPs.
To build file system a fusepy module was used.

Slides are available here and here

Wednesday, July 11, 2018

CPython logo

On last Python meetup I've made a talk about CPython logo, its history, authors and meaning.

Slides are available here and here

Please let me know if I you find any mistakes and inaccuracies

And here the embedded presentation:

Thursday, May 10, 2018

Seven Concurrency Models in Seven Weeks

Just finished reading Seven Concurrency Models in Seven Weeks book by Paul Butcher.

This book is the best overview of concurrent/parallel programming models I've ever seen.

Highly recommend for everyone interested in distributed systems and concurrent/parallel programming paradigms.

Tuesday, May 1, 2018


Last week my friend asked me about 0x80 bit that sometimes set in Python marshalled objects data in *.pyc file.
So I decided to understand what kind of flag it is and why it is used.

This is FLAG_REF — a special flag that was added in Python 3.4, and used in conjunction with TYPE_REF type code value. For source code look at:
In the current version (Python 3.7) everything is the same.

TYPE_REF = 'r'
FLAG_REF = 0x80

All currently used type code values are < 0x80 (the highest type code is TYPE_DICT = '{', i.e. 0x7b), so it was decided to store this flag right in the highest bit of type code value.

Now let's see how this flag works.

When marshal_load reads *.pyc file with read_object function, it calls r_object to read marshalled binary data object by object.
To understand the type of an object to read, a single byte with code is read.

flag = code & FLAG_REF to understand if FLAG_REF is set.
type = code & ~FLAG_REF to get only 7 bits (as all type code values are < 0x80).

Also a special macro R_REF is defined. This macro reads unmarshalled object into special refs list with r_ref function only if FLAG_REF is set.
NB: on write (for mashalling) the hashtable, instead of the list, is used.

So when type is equal to TYPE_REF, then only the reference index is read, thus eliminating the need to unmarshall the already unmarshalled object again.

Tuesday, April 24, 2018

Compressed Rich Text Format (RTF) aka LZFu compression and decompression in Python

I uploaded my package for compression and decompression compressed RTF (aka LZFu or MELA) files to PyPI. This package is written in pure Python, and supports Python 2.7 and 3.x versions.

Now it can be easily installed using pip command:
pip install compressed_rtf

The package is based on Rich Text Format (RTF) Compression Algorithm

Sources are available on github:

Thursday, April 19, 2018

RuCTF Finals 2018

This year I've decided to go to Yekaterinburg for the final of RuCTF 2018 to visit conference, watch the game (this time I was a spectator, because only students can play), see friends and and spent good time with a team.

We also visited local zoo.

Conference flew by quickly, thanks to guys who made talks and thanks to guys from HWV. You can find some presentations here:

One more event on RuCTF is battles. Two people from different teams solve the task within 15 minutes. I've solved matrix task with my team mates :)

RuCTF finals are held in Yeltsin Center with a huge display with game monitor.

This year out team SiBears finished in the 8th place (out of 26 teams in total, from 6 countries, 2 guest and 1 school team).

1st place: Bushwhackers from Moscow State University
2nd place: c00kies@venice from Ca'Foscari University
3rd place: saarsec from Saarland University

It was really hard game. More than 10 hours of stress. Out team and some other teams had an accidental power outage during the game :(
By the way the game was delayed due to some problems with vulnerable image's disk.

After the game we made team photos and went to the after-party where guys played "Nalivaika" game, where it was required to answer security-related questions several rounds. Top 10 contestants drank Vodka after each round.

In the evening we gathered on the kitchen of our hotel to eat and discuss the game, team captain even has the strength to make some fixes in our exploit farm :)

Thanks RuCTF organizers, Hackerdom and all participants. 
RuCTF is friendship!

Monday, March 19, 2018

Twin paradox and distributed systems

In the future we definetely will travel through the Universe. And the speed of the starships will be close to the speed of light, or even probably beyond it (as you can see in some movies). Therefore according to the twin paradox all the clocks will have different time. Intergalactic distributed computer network will be really difficult, and eventually (I hope so) constistent.

So, I have some questions:
How will the clocks in computers on the starships, stations, and planets be syncronized?
There are special data structure called CRDT (conflict-free replicated data type) but is also uses timestamps.
Probably we need to find some source/invariant of global time (but how can we break the laws of physics (may be quantum physics can help here)), or find a way to make amendments base on straship acceleration/speed/traveled distance if it is possible.

Saturday, March 17, 2018

iCTF 2018

Today the historical event happened. The team SiBears I'm playing with won first place in iCTF 2018 CTF contest.

First time I've played iCTF in 2011. And it was hard but really interesting. And I'm very proud that out team has finally won, and finished at first place, because this is one of the first (if not the first) classic CTF competitions. Thanks to the UCSB iCTF team for such great event.

Friday, March 16, 2018

Surely You're Joking, Mr. Feynman!

Recently I've finished reading the book "Surely You're Joking, Mr. Feynman!". This book was recommended in the book "A Mind For Numbers: How to Excel at Math and Science (Even If You Flunked Algebra)".
really enjoyed reading it. The great and funny stories from the life of the great man.
Recommend this book to everyone, it is really good.

Tuesday, March 13, 2018

Гипотеза RUM

Группа разработчиков из DASlab (Лаборатория систем обработки данных) Гарвардской школы инженерных и прикладных наук Джона А. Полсона (SEAS) разработали гипотезу RUM (Read, Update, Memory, то есть Чтение, Обновление/Запись и Память).

Гипотеза RUM описывает повсеместный компромисс между оптимальным чтением, записью и используемой памятью у методов доступа к современным системам данных.

При построении методов доступа к современным системам, каждый сталкивается с одними и теми же фундаментальными задачами и проектными решениями.
Есть три величины, которые исследователи всегда стараются свести к минимуму:
  • накладные расходы на чтение (R)
  • накладные расходы на обновление/запись (U)
  • накладные расходы на память (оперативная память, диск, и др.) (M)
три этих параметра (накладные расходы RUM) образуют треугольник:
Накладные расходы RUM - на чтение, запись и используемую память.

Гипотеза RUM:
разработка методов доступа, которые устанавливают верхнюю границу для двух (из трёх) накладныех расходов (оверхедов) RUM, приводит к жёсткой нижней границе для третьего (оверхеда), которая не может быть дополнительно уменьшена.

TLDR: Чтение, Обновление/Запись, Память  оптимизация двух за счет третьего.

Накладные расходы на чтение выражаются в усилении чтения (read amplification), то есть отношении объёма фактически прочитанных данных к объему необходимых данных.
Аналогично накладные расходы на обновление/запись выражаются в усилении записи (write amplification), а накладные расходы на память в усилении занимаемой памяти (space amplification).
Здесь усилении записи (write amplification)  это отношение объёма фактически записанных данных к реально изменённому, а усиление занимаемой памяти (space amplification) — отношение объёма данных + объём вспомогательных данных таких как индексы и другие метаданные, к объёму самих данных.

При разработке методов доступа к системам, существует множество подходов, каждый из которых предназначен для конкретного случая использования. Например, чтобы минимизировать стоимость чтения данных, можно использовать индексы на основе хэш-таблиц или структуры данных с логарифмической сложностью, такие как Hash table, B-Tree, Skiplist, Trie и др.

Структуры с оптимизацией записи  уменьшают накладные расходы на запись, сохраняя обновления во вторичных структурах данных. Основная идея заключается в объединении обновлений с последующим применением к системе управления данными. Примеры: LSM-tree, Partitioned B-tree (PBT), алгоритм Materialized Sort-Merge (MaSM), алгоритм Stepped Merge и Positional  Differential Tree (PDT), и др.

Способы эффективного использования памяти предназначены для уменьшения накладных расходов на хранение данных. К ним относятся методы сжатия и индексы с потерями, такие как Фильтр Блума, HyperLogLog, Count-min sketch, Bitmap, Zonemap, Sparse индексы, и др.

[1] The RUM Conjecture - DASlab @ Harvard University
[2] Manos Athanassoulis, Michael S. Kester, Lukas M. Maas, et al.: “Designing Access Methods: The RUM Conjecture,” at 19th International Conference on Extending Database Technology (EDBT), March 2016. doi:10.5441/002/edbt.2016.42

Sunday, February 25, 2018

A Mind For Numbers by Barbara Oakley

I've finished a book "A Mind For Numbers: How to Excel at Math and Science (Even If You Flunked Algebra)"
Good book with good advices and techniques of how to effectively learn, some of them I have used many times :)
I wish I have read this book earlier. Highly recommend this book to students and pupils.

Wednesday, February 7, 2018

Buzzword poem generator algorithm

Conditions of the problem
Given: a list of buzzwords.
Problem: generate poem from these buzzwords.

We also need to know:

1) Result poem lines number.
2) Rhyme scheme.
3) Syllables in each line.

4) Minimum number of words in each line.
5) Somehow know the number of syllables in each buzzword.
6) Somehow get the list of rhyming buzzwords.

1) Knowing the total number of syllables in each line  generate all combinations of sums (compositions) from all available number of syllables.
E.g. we have buzzwords with the number of syllables 1, 2 and 3, and the total number of syllables in line (i.e. sum of syllables) equal 3. Compositions of 3 are:
+ 1 + 1
+ 2
+ 1
If we assume that we don't have buzzwords with number of syllables 2 then only two possible combinations can be: 1 + 1 + 1 and 3.
NB: Each positive integer n has 2 ** (n - 1) distinct compositions.

2) Knowing the number of syllables in each line, generate the base (I call it so) for future poem. The base is generated from random compositions for each number of syllables.
E.g. for two line poem with 3 and 4 syllables, the possible base can be the next:
[2, 1]
[1, 2, 1]
[2, 1]
[2, 2] 
The poem base can be also limited to the minimum number of words per line. 
For the example above with the minimum number of words equal 3, possible base can be the next:
[1, 1, 1]
[1, 1, 2]
or (notice that the first line has only one possible option):
[1, 1, 1]
[1, 1, 1, 1]

3) Knowing the rhyme scheme it is possible to get the last number of syllables for each line from poem base and group these numbers by rhyme scheme letters, thereby simplify the search of rhyme buzzwords.
The last number of syllables for each line are taken because only the last words form a rhyme.
For example the rhyme scheme is ABA, and the poem base is:
[1, 2, 1]
[1, 1, 2]
[3, 2]
Last numbers of syllables for each line are 1, 2 and 2.

1 A
2 B
2 A
Group them by rhyme scheme letters:
A: [1, 2], B: [2]

It is required to find all buzzwords that are rhyme, and have 1 and 2 syllables, then randomly select one
 from the list of found variants for A letter.
And find only one random buzzword with 2 syllables for B letter.
It is important that after using a buzzword, it must be removed from the list of buzzwords to avoid duplication.
If the rhymes was not found it is required to indicate that, for example by returning empty result.

4) To this step we already have poem base, and rhyme scheme letters mapped to the found buzzwords. In case of empty mapping of rhyme scheme letters to buzzwords — go to step 2 and try again.
If the mapping of rhyme scheme letters to buzzwords isn't empty, start filling the poem base with buzzwords according to the number of syllables. The filling of each poem base line must be performed until the penultimate number of syllables, because the last ones were already found in previous step and saved in the mapping of rhyme scheme letters to buzzwords.
As in the previous step, it is important that after using a buzzword, it must be removed from the list of buzzwords to avoid duplication.
If the filling of the poem was not succeed — go to step 2 and try again.
If the number of attempts is too big, stop trying and display the error message.


Support the poem metrical feet, to generate more smooth poems.

This algorithm was used in my project:

Tuesday, February 6, 2018

Buzzword poem generator

Last week I've created a new project on my github — a tool for the generation of poems from the buzzwords. I called it Buzzword poem generator. As usual, the tool is written in pure Python and supports 2.7 and, of course, Python 3.x.

The project was inspired by recent tweet from Lena Hall (@lenadroid on twitter):

My tool generates poems based on the list of buzzwords. It is possible to pass the parameters of the rhyme scheme and number of syllables in each line. For example for the rhyme scheme ABAB, and 7 syllables per each line, the next poem was generated:

$ python -r ABAB -s 7 7 7 7

Memcached Vault Impala Spark
Redshift Chef Celery Swarm
Haskell Rust Zookeeper Splunk
Python Flink Kinesis Storm

Use it with pleasure :)

CPython github repository bots

I've watched a Monty Python and the Holy Grail movie recently. Despite the fact that it was filmed in 1975, the quality of the video and the jokes are great. Recommend to everyone this British classic comedy.

Fun facts: the bots in CPython repository on github, are called after characters from this movie.
For example the next bots:

Have fun!

Saturday, January 27, 2018

Harry Potter and the Methods of Rationality

I've finally finished reading the book Harry Potter and the Methods of Rationality (HPMOR).
What can I say: it is amazing book. One of the best I've ever read.
10/10 Excellent work. I will definitely recommend it to everyone.

I wish the author to live long, and prosper - EXPECTO PATRONUM! :)

Thursday, January 25, 2018

Contribute to CPython

Today I tried to contribute to CPython. My pull request was small and changes minor, I found a mistake in HAMT algorithm description in the comments.

After sending my pull request and my username was checked by the bot "the-knights-who-say-ni", and the bot said the next message with "CLA not signed" badge.

I was really surprised. Despite this my fix was accepted and merged.
But here are the steps needed in order to sign the CLA (from
  1. If you don’t have an account on (aka b.p.o), please register to create one.
  2. Make sure your GitHub username is listed in the “Your Details” section at b.p.o.
  3. Fill out and sign the PSF contributor form. The “ username” requested by the form is the “Login name” field under “Your Details”.
After at least one US business day "Contributor Form Received Yes on " and "Is Committer <...>" status will appear in your profile details on b.p.o site.

Since this time your pull requests on github will be with "CLA signed" badge.
It is also required to provide a link like "" to the issue in PR description and specify a bpo- in PR title.