
Welcome to "Cultivating DEI", a series in which Postgres community members share their insight and experience about creating a more diverse and inclusive Postgres environment where all are welcome.

Recently I've been thinking a lot about the relationship between the PostgreSQL community and the database research community. To put it bluntly: these two communities do not talk to each other!

There are many reasons why I am concerned about this situation. First, I consider myself a member of both communities. Even if right now I am 90% in industry, I can't write off my academic past. Writing a scientific paper in the hope of having it accepted at a real database conference is something that appeals to me.

Secondly, we want to have quality candidates for database positions. Anyone who has recently tried to fill these positions knows that this is not an easy task. If you are looking at recent college grads, there is almost no chance that you will find somebody who has PostgreSQL experience. Here is where we face the other side of the problem.

The problem is not simply that scientists do not speak at PostgreSQL conferences and that PostgreSQL developers do not speak at academic conferences. The larger issue is that for many Computer Science (CS) students, academic research and practical experience do not intersect. They learn about some incredible algorithms, and as part of their coursework they may suggest some enhancements to existing algorithms. They then practice their SQL skills with MySQL, which from my observations lacks so many basic features that it can hardly be taken seriously as a data platform.

If students practiced using PostgreSQL, they would have a full-scale, enterprise-ready object-relational database -- not a "light" version, but a robust platform that supports a multitude of index and data types, constraints, procedural languages, and much more.

I've heard from several professors that MySQL is okay for "learning SQL." I want to ask -- what does "learning SQL" mean? Is it just learning how to write syntactically correct SQL? One contributing factor to the problem is that MySQL comes on each laptop by default, integrated with basic tools that allow building websites. It is integrated with WordPress. There is no reason for PostgreSQL not to have similar support, but it is not in place.

This is particularly frustrating when you recognize how much database research was completed using Postgres, for Postgres, or with the help of Postgres: R-tree and GiST indexes, for example. Also, the SIGMOD Test of Time Award in 2018 went to the paper "Serializable isolation for snapshot databases," which was implemented in PostgreSQL.

I know the answer to the question "why do they not talk?" Researchers do not want to speak at PostgreSQL conferences, because those are not scientific conferences, and participation in them will not result in a publication. Postgres developers do not present at CS conferences, because they do not want to write long papers. Even if they do submit something, their papers are often rejected as "not having any scientific value." I have experienced this on multiple occasions.

I came across another example of "why" when I attended the ACM/SIGMOD conference in Amsterdam. I attended a compelling presentation on the problem of cardinality estimation over multi-join queries, which introduced new optimization techniques. The presenter mentioned that he had used Postgres to build the prototype. I was too far back in the room to ask my question, so I reached out via the conference website.

I asked the presenter why he didn't submit a patch. He replied that their approach was hacky, and that it would need more work before they could think about adding it to Postgres. I asked whether he would be interested in working on it with some PostgreSQL community members. His reply? "Not in the next two years, I've just received a post-doc position at Microsoft, so I can't do it for the next two years."

So yes -- I know the answer as to why these two communities historically do not communicate. However, I do not like or accept it. Perhaps we can talk about and resolve this problem together?!

Contributor Bio:

Henrietta Dombrovskaya is a database researcher and developer with over 30 years of academic and industrial experience. She holds a Ph.D. in Computer Science from the University of Saint Petersburg, Russia. She taught database and transaction theory at the University of Saint Petersburg, as well as multiple database tuning classes for both beginners and advanced professionals.

Her professional experience includes consulting for a number of government projects in Chicago and New York, and providing data services in the financial, manufacturing, and distribution sectors. She is a co-author, with B. Novikov, of the book “System Tuning”, BHV, S.-Petersburg, Russia. Her research on overcoming the object-relational impedance mismatch was published in the proceedings of EDBT 2014 in Athens and ICDE 2016 in Helsinki. At Braviant Holdings she is happy to have an opportunity to put the results of her research into practice.

Henrietta Dombrovskaya is a co-organizer of the Chicago PostgreSQL User Group and a member of the Diversity, Equity, and Inclusion Work Group for the Postgres Conference Series. She was recently named the 2019 "Technologist of the Year" by the Illinois Technology Association. This award is "presented to the individual whose talent has championed true innovation, either through new applications of existing technology or the development of technology to achieve a truly unique product or service."