PrestoCon 2023 has ended
In-person
December 5-6, 2023

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PrestoCon 2023 to participate in the sessions. 

Please note: This schedule is displayed in Pacific Standard Time (UTC -8). The schedule is subject to change.
Tuesday, December 5
 

9:00am PST

Registration
- Peerless Organic Fair Trade French Roast and Swiss Water Process Decaf, assorted Numi teas served with condiments (half-and-half, 2% milk, oat milk)
- spring water
- assorted sodas
- sparkling water and flavored sparkling water

Tuesday December 5, 2023 9:00am - 5:00pm PST
2nd Floor Foyer

10:00am PST

[Pre-Registration Required] Getting started with Presto - Yi-Hong Wang & Kiersten Stokes, IBM
In this workshop you’ll learn the basics of Presto, the open-source SQL query engine. You’ll get Presto running locally on your machine, connect data sources, and run some queries.
This is a beginner-level workshop for software developers and engineers who are new to Presto.

Course outline:
  • What is Presto and why you’d use it
  • How to write a Presto query
  • How to create and deploy a Presto cluster on your machine using Docker
  • How to add 2 data sources (MySQL and MongoDB) and query them
  • How to create dashboards/visualizations of your data


Speakers

Yihong Wang

Senior Software Developer, IBM
Yihong Wang is a software developer with the Watson AI and Data Open Technologies group at the IBM Silicon Valley Lab, where he actively works on the Kubeflow, Presto, and Node.js communities. He has been successfully delivering several Kubeflow releases to IBM Kubernetes Service...

Kiersten Stokes

Open Source Software Developer, IBM
Kiersten is an open-source software developer with the Watson AI and Data Open Technologies Group. She has been an active contributor to several AI technologies over the last several years including Jupyter Enterprise Gateway, PyTorch, and Presto. In addition to contributing quality...


Tuesday December 5, 2023 10:00am - 11:30am PST
Boole Room

1:00pm PST

[Pre-Registration Required] Building an Open Data Lakehouse with Presto and Apache Iceberg - Ajay Gupte & Kiersten Stokes, IBM
You may be familiar with the Data Lakehouse, an emerging architecture that brings the flexibility, scale and cost management benefits of the data lake together with the data management capabilities of the data warehouse. In this workshop, we’ll get hands-on building an Open Data Lakehouse - an approach that brings open technologies and formats to your lakehouse.

This is a beginner-level workshop for software developers and engineers who are building data platforms. We’ll use Presto for the open source SQL query engine, Apache Iceberg to enable ACID transactions, and Minio S3-compatible Object Storage for the data lake. You’ll get hands-on with Presto and Iceberg. We’ll show you how to set up and connect these technologies, how to run queries on your data, and how to access and interpret Iceberg metadata. By the end, you should be well-versed in Presto and Iceberg and have the building blocks to create your own Open Data Lakehouse.

Course outline:
  • Introduction to the Open Data Lakehouse and the Presto query engine
  • Introduction to Apache Iceberg and common use cases
  • Querying S3 data with Presto
  • Integrating Iceberg with Presto
  • Working with Iceberg data and metadata tables
  • Future roadmap – what additional Iceberg support is coming to Presto like time travel and merge-on-read support

Speakers

Ajay Gupte

Software Engineer, IBM
Ajay is a senior software engineer on the Netezza database team. He contributes to the open source Presto project and its Apache Iceberg connector.

Kiersten Stokes

Open Source Software Developer, IBM
Kiersten is an open-source software developer with the Watson AI and Data Open Technologies Group. She has been an active contributor to several AI technologies over the last several years including Jupyter Enterprise Gateway, PyTorch, and Presto. In addition to contributing quality...



Tuesday December 5, 2023 1:00pm - 3:00pm PST
Lovelace Room
  Workshops

1:00pm PST

[Pre-Registration Required] Getting started with Prestissimo on Docker - Sujit Madiraju & Minhan Cao, IBM
Prestissimo is the project code-name for the new C++ Native Worker for Presto. The new Native worker is a replacement for the Java worker in Presto clusters.

This is an intermediate-level workshop for software developers and engineers who are interested in getting their hands on Prestissimo. We recommend that you are already familiar with Presto and how to run distributed systems.
We will be demonstrating our newly containerized, simple deployment of Prestissimo with Docker (Presto with Velox worker). Attendees will be able to spin up their own instance in minutes and immediately begin playing around with a Prestissimo sandbox, backed by Hive metastore and MinIO object storage.

Speakers

Minhan Cao

Software Architect, IBM
A software engineer at IBM who is currently working on the Prestissimo project team.

Sujit Madiraju

Software Developer, IBM
Software dev at IBM, working on integrating Velox into IBM's Lakehouse (watsonx.data).



Tuesday December 5, 2023 1:00pm - 3:00pm PST
Boole Room
  Workshops
 
Wednesday, December 6
 

8:00am PST

Registration
Wednesday December 6, 2023 8:00am - 5:00pm PST
2nd Floor Foyer

8:00am PST

Sponsor Showcase
Coffee + Tea
- Peerless Organic Fair Trade French Roast and Swiss Water Process Decaf, assorted Numi teas served with condiments (half-and-half, 2% milk, oat milk)
- spring water
- assorted sodas
- sparkling water and flavored sparkling water

Vegetarian Frittata
spinach, onions, peppers, mushrooms, squash and gruyere cheese

Seasonal Breakfast Fruit Cup
The freshest fruit the season has to offer (vegan, GF)

Wednesday December 6, 2023 8:00am - 7:00pm PST
Grand Hall

9:00am PST

Keynote: Welcome & Opening Remarks - Ali LeClerc, IBM & PrestoCon Chair; Girish Baliga, Uber & Chair, Presto Foundation
Speakers

Girish Baliga

Director of Engineering, Uber Technologies Inc
Girish manages Batch Data at Uber, and has previously worked on Search and Real-Time Analytics. Before that, he spent over a decade optimizing resources, search ads, and geo data analytics at Google, interrupted by a brief start-up stint at Urban Engines. He has a PhD in Computer...

Ali LeClerc

PrestoCon Chair, Presto Foundation | IBM
Ali has over a decade of experience in open source software, product, and community marketing. She was most recently Head of Community at Ahana, the SaaS for Presto company, which was acquired by IBM. She works closely with the Presto Foundation to drive open source programs. Prior...



Wednesday December 6, 2023 9:00am - 9:20am PST
Hahn Auditorium

9:20am PST

Keynote: Presto: The Vertically Integrated Lakehouse Query Engine - Tim Meehan, IBM & Chair, Presto Technical Steering Committee
The Presto project is at an inflection point. Learn about trends in data management and how the Presto project is positioned to be the first choice for analytics on the lake.

Speakers

Tim Meehan

Software Engineer, IBM
Tim is a Software Engineer at IBM and is technical steering committee chair of the Presto Foundation.



Wednesday December 6, 2023 9:20am - 9:50am PST
Hahn Auditorium
  Keynote Sessions
  • Presentation Slides Attached Yes

9:50am PST

Keynote: Presto @ Meta - Naveen Cherukuri, Meta
Presto has been a key engine at Meta for the past decade. In this talk, Naveen will discuss the evolution and usage of Presto at Meta. He’ll share some of the current workloads Presto powers and the use cases behind them, and what the roadmap is going forward.

Speakers

Naveen Cherukuri

Senior Engineering Manager, Meta
I support the Presto and Velox teams at Meta


Wednesday December 6, 2023 9:50am - 10:20am PST
Hahn Auditorium

10:20am PST

Sponsored Keynote: Denodo: Expanding your Data Lake to an Enterprise Data Fabric - Pablo Álvarez-Yanez, Denodo
The Denodo platform is a solution that provides data management capabilities across the distributed modern data landscape. Its goal is to provide a consistent engine to enable and enforce security, data integration, self-service and governance across all data, regardless of location and technology. As part of its capabilities, it includes a distribution of Presto as its data lake engine. Attend this session to learn how Denodo and Presto work together, how Denodo can extend Presto's capabilities in those areas, and other interesting features that bring agility and ease-of-use to a data strategy.

Speakers

Pablo Álvarez-Yanez

Product Manager, Denodo
Pablo is the Global Director of Product Management at Denodo. He's been working with data virtualization for over 15 years, covering a variety of roles ranging from code development to sales engineering. He currently leads the Product Management team at Denodo, and is responsible...



Wednesday December 6, 2023 10:20am - 10:30am PST
Hahn Auditorium
  Sponsored Keynote Sessions
  • Presentation Slides Attached Yes

10:30am PST

Break
Coffee + Tea
- Peerless Organic Fair Trade French Roast and Swiss Water Process Decaf, assorted Numi teas served with condiments (half-and-half, 2% milk, oat milk)
- spring water
- assorted sodas
- sparkling water and flavored sparkling water

Vegan Parfait
chia, coconut and fresh berry
(vegan, GF, dairy free)

Assorted Baked Goods
Danish pastries, muffins, croissants and bagels with cream cheese, jelly, creamery butter
-Displayed with toasters for guests

Wednesday December 6, 2023 10:30am - 11:00am PST
Grand Hall

11:00am PST

Sponsored Keynote: Building Real-time Data Apps for Presto Open Data Lake using modern APIs - Srini Gurrapu & Alonso Vega, Bhuma
The growth of Open Data Lake architectures founded on PrestoDB allows us to imagine and build modern “real-time data apps” that deliver “actionable insights” to facilitate business outcomes - across all data sources in the organization.

This session explores different low-code approaches to build real-time data apps on the PrestoDB data lake using modern APIs such as REST and GraphQL connectors to the federated SQL. In particular, the session focuses on the new JS Connector - and some additional features, such as the Presto orchestrator to manage the query prioritization and deeper insights into the runtime query analytics.

Speakers

Srini Gurrapu

Founder & CEO, Bhuma.dev

Alonso Vega

Software Architect, Bhuma



Wednesday December 6, 2023 11:00am - 11:10am PST
Hahn Auditorium
  Sponsored Keynote Sessions
  • Level Any
  • Presentation Slides Attached Yes

11:10am PST

Keynote: Presto at IBM - Leaning into Open Source for the Open Data Lakehouse - Vikram Murali, IBM
In this keynote, Vikram will share more about Presto at IBM. He will discuss why IBM chose Presto to power watsonx.data, the open data lakehouse at IBM and why open source in general is critical to watsonx.data. He’ll also highlight the key areas and features within Presto that the IBM Data & AI team is working on and contributing back to the open source project, including optimizer and performance work.

Speakers

Vikram Murali

Vice President, Development - Data and AI Software, IBM
Vikram Murali is Vice President of Engineering for Hybrid Data Management within IBM Data and AI. Hybrid Data Management encompasses development and support for Db2 Distributed, Netezza, Cloud Pak for Data Systems, BigData partnerships as well as hosted and managed Cloud...



Wednesday December 6, 2023 11:10am - 11:40am PST
Hahn Auditorium
  Keynote Sessions
  • Presentation Slides Attached Yes

11:40am PST

Sponsored Keynote: Future-Forward Performance Enhancements (watsonx.data) - Satya Krishnaswamy, IBM
During this keynote presentation, Satya will provide an in-depth exploration of IBM's forthcoming strategic investments, designed to improve query performance within watsonx.data. The session will spotlight preliminary performance results directly from IBM Labs, demonstrating how each targeted investment area contributes incrementally to overall query efficiency.

Speakers

Satya Krishnaswamy

Program Director for Development - IBM Data & AI, IBM
Satya Krishnaswamy has 15 years of diverse experience in Product Development and Support. With expertise in managing Db2 warehouse/distributed domains encompassing Runtime, Storage, Connectivity, Spatial, and appliance business, he has played a pivotal role in shaping these areas...



Wednesday December 6, 2023 11:40am - 11:50am PST
Hahn Auditorium
  Sponsored Keynote Sessions
  • Presentation Slides Attached Yes

11:50am PST

Lunch
Salmon a la Veracruzana
grilled salmon, roasted tomato and garlic salsa, over cilantro rice

Baja Grilled Chicken Bowl
baja grilled chicken, cilantro rice, seasoned pinto beans, pico de gallo, shaved cabbage
*Pico de gallo on the side

Vegan Fiesta Bowl
grilled nopales, ancho chili purée, sweet corn, cilantro rice, seasoned pinto beans, and roasted tomato salsa
(GF-Vegan)

Tortilla Chips
Individual servings

Mexican Caesar Salad
grilled hearts of romaine, avocado Caesar dressing, pepitas, grilled corn (Vegan)

Mexican Chocolate Pots de Creme
with cinnamon and a hint of chili topped with candied pepitas and a fresh raspberry



Wednesday December 6, 2023 11:50am - 12:45pm PST
Grand Hall

12:45pm PST

Introduction to Prestissimo - Aditi Pandit, IBM (Ahana team)
Prestissimo is the project code-name for the new C++ Native Worker. The new Native worker is a replacement for the Java worker in Presto clusters. Prestissimo is a state-of-the-art query processing engine. It builds on the open-source Velox project, a library of reusable data processing primitives. The primitives include new aggregation, join, and window operators, a Hive connector, and native Parquet and Iceberg readers. The Native engine has many benefits: (i) a large performance boost from vectorization, SIMD, and sophisticated adaptive runtime optimizations; (ii) more predictable behavior, because the memory management framework provides better accounting and features like spilling and memory arbitration, giving the engine more explicit memory control than Java GC; (iii) better operational guidance. This talk will give insight into how to use Prestissimo and understand its query processing behavior.

Speakers

Aditi Pandit

Principal Software engineer, IBM/Ahana
Aditi is a Principal Engineer at Ahana/IBM. She is a Prestissimo committer in the Presto and Velox open source projects. Aditi has an extensive data infrastructure career of ~18 years at Google, Teradata Aster and Informatica before joining Ahana/IBM. She has an MS in Computer Science...



Wednesday December 6, 2023 12:45pm - 1:15pm PST
Hahn Auditorium
  Prestissimo Track, Session Presentations
  • Level Any
  • Presentation Slides Attached Yes

12:45pm PST

Presto Query Governance at Uber - Yasaman Samei, Gurmeet Singh & Hitarth Trivedi, Uber
The talk describes the production best practices at Uber and the infrastructure that we have developed and used to govern the Presto query workload.

Speakers

Hitarth Trivedi

Software Engineer, Uber
Working on Presto @ Uber

Yasaman Samei

SW Eng, Uber
Data @ Uber

Gurmeet Singh

Staff Engineer, Uber
I have been on the Uber Presto team for 3+ years. Before that, I worked mostly in the systems/storage domain at a few big companies and startups.



Wednesday December 6, 2023 12:45pm - 1:15pm PST
Boole Room
  Presto Track, Session Presentations

1:20pm PST

A Journey of Evaluating Prestissimo Against PrestoDB on TPC-DS - Shengxuan Liu & Changyang Gu, ByteDance
Changyang Gu and Shengxuan Liu from ByteDance will present how the ByteDance Presto team achieved a 1.7X performance gain on Prestissimo TPC-DS end-to-end tests, compared with PrestoDB. The talk covers how the team diagnosed Prestissimo's Hash Aggregation and Hash Join performance issues, and their new implementation aimed at better performance. They will also share their local TPC-DS test suite and how it contributes to their development.

Speakers

Changyang Gu

Software Engineer, ByteDance Inc.
Changyang Gu, currently a software engineer at ByteDance, specializes in big data OLAP engines. His expertise lies in data engine acceleration and optimization, and he has collaborated closely with the Presto and Velox communities. Previously, he was part of the IBM DB2 database team...

Shengxuan Liu

Software Engineer, ByteDance Inc.
Shengxuan Liu is a software engineer from ByteDance and focuses on big data OLAP processing engines. He works closely with PrestoDB, Alluxio and Velox communities. Prior to ByteDance, he was a software engineer in Oracle and received his Master's in Computer Science from Rensselaer...



Wednesday December 6, 2023 1:20pm - 1:50pm PST
Hahn Auditorium
  Prestissimo Track, Session Presentations
  • Level Any
  • Presentation Slides Attached Yes

1:20pm PST

Statistics with Sampling Using Iceberg on Presto - Zac Blanco & Xiuwen Zheng, IBM
In this presentation we'll highlight some of the work that's been done to create lightweight tables composed of samples of rows from much larger tables. The sample tables are used to infer statistics for large tables without resorting to full table scans. The statistics help the optimizer make better decisions during Presto's query planning phase.

Speakers

Xiuwen Zheng

Senior Back End Developer Intern, IBM
I am a fourth-year Ph.D. student in Computer Science, supervised by Dr. Amarnath Gupta and Prof. Arun Kumar. My research interests include data-driven DBMSs, polystore databases, machine learning systems, and AI for DB.

Zac Blanco

Software Developer, IBM
Zac is an engineer at IBM working on the Presto query optimizer team.



Wednesday December 6, 2023 1:20pm - 1:50pm PST
Boole Room
  Presto Track, Session Presentations

1:55pm PST

Optimizers of Meta Scale - Feilong Liu, Meta
In the last year, we added several optimization rules based on sub-optimal query patterns found in workloads at Meta. These optimizations are mainly query rewrites that turn inefficient queries into more efficient plans, including mitigating data skew in query execution, getting rid of inefficient cross joins, and speeding up certain operations. In this talk, I will go through these optimizations.

Speakers

Feilong Liu

Research Scientist, Meta
I am a Research Scientist on the Presto team at Meta. Before that, I earned my PhD in databases at The Ohio State University.



Wednesday December 6, 2023 1:55pm - 2:05pm PST
Boole Room
  Presto Track, Lightning Talks
  • Level Any
  • Presentation Slides Attached Yes

1:55pm PST

Presto-Vector Database Connector – Bringing Vector Search to Presto - Nasrullah Sheikh, IBM Research & Berthold Reinwald, IBM
Vector search is a critical component in AI-powered applications such as semantic search and Retrieval Augmented Generation (RAG). Libraries such as FAISS and Annoy, and databases such as Milvus and Chroma, lack native SQL support and cannot support hybrid queries on both structured and unstructured data. To overcome these limitations and run hybrid queries on data stored in a data lakehouse, it is imperative to have vector search capability natively built into SQL. We built a Presto connector that enables vector search on data indexed by vector databases such as FAISS. The connector follows the semantics of schemas and tables. The vectors are stored and indexed by the vector database and queried through Presto, which facilitates the execution of hybrid queries in the Presto lakehouse using SQL semantics. In this talk, we will discuss the technical details of the Presto vector database connector, followed by a brief demo.

Speakers

Berthold Reinwald

Principal Research Scientist, IBM
Berthold is a researcher in data management & AI, working on Data Lakehouses at IBM Research.

Nasrullah Sheikh

Research Scientist, IBM Research
I have 4 years of experience working as a research scientist, focusing on essential research in the fields of artificial intelligence and data management, specifically graph machine learning.



Wednesday December 6, 2023 1:55pm - 2:05pm PST
Hahn Auditorium
  Presto Track, Lightning Talks
  • Level Any
  • Presentation Slides Attached Yes

2:10pm PST

Accelerating ElasticSearch Through Velox - Sungho Park, ByteDance
Sungho Park from ByteDance will present how the team leveraged the Velox library to accelerate queries in a non-SQL-compatible database written in a non-native language. The talk will cover how Velox, as a C++ library, was exposed and used in Java from ElasticSearch, handling memory management across languages. To support the non-SQL engine ElasticSearch, the team converts its queries for a SQL engine, using Calcite and Velox to handle query plan building and execution. The talk will also show the versatility of Velox in handling multiple ElasticSearch cases, such as high-QPS workloads or join-intensive queries.

Speakers

Sungho Park

Software Engineer, ByteDance
Sungho Park has been working at ByteDance as a software engineer for 3 years, focusing on OLAP processing engines. Previously, Sungho attended U.C. Berkeley, studying both Computer Science and Statistics.



Wednesday December 6, 2023 2:10pm - 2:40pm PST
Hahn Auditorium
  Prestissimo Track, Session Presentations
  • Level Any
  • Presentation Slides Attached Yes

2:10pm PST

Learned Query Optimization in PrestoDB - Nesime Tatbul & Dave Cohen, Intel; Ryan Marcus, University of Pennsylvania; Christoph Anneser, Technical University of Munich
As part of a research collaboration between Intel and Meta, we developed AutoSteer - an ML-based solution that automatically drives query optimization in any SQL database that exposes tunable optimizer knobs. AutoSteer builds on the Bandit optimizer (Bao) developed by Intel Labs and MIT DSAIL, and extends it with new capabilities to facilitate usability in disaggregated SQL systems such as PrestoDB. We successfully applied AutoSteer on PrestoDB (via both generic and custom connectors), and conducted a detailed experimental evaluation with both public benchmarks and a production workload from Meta’s PrestoDB deployments. We achieved up to 40% improvement in query performance vs. PrestoDB's native query optimizer, and contributed a new optimizer heuristic currently in use by Meta. AutoSteer comes with a visual frontend and is available open source. At PrestoCon, we would like to share our findings with the broad PrestoDB community to get feedback and promote future collaboration.

Speakers

Dave Cohen

Sr Principal Engineer, Intel
Dave has been working on large-scale, distributed data systems for over 30 years.

Nesime Tatbul

Sr. Research Scientist, Intel Labs and MIT

Christoph Anneser

Mr., Technical University of Munich
Optimizing database systems adaptively at runtime.

Ryan Marcus

Assistant Professor, University of Pennsylvania
An assistant professor at UPenn, Ryan researches applications of machine learning to data systems, as well as learned and instance optimized systems.



Wednesday December 6, 2023 2:10pm - 2:40pm PST
Boole Room
  Presto Track, Session Presentations
  • Level Any
  • Presentation Slides Attached Yes

2:45pm PST

Case Study: Batch-Size in Velox Aggregation - Shiyu Gan, ByteDance
In this talk, the speaker will present how an accidental discovery of the difference batch size makes in aggregation led to an investigative journey that resulted in two HashAggregation optimizations, as well as an appreciation of the importance of efficient memory management. It is the speaker's wish that the techniques and caveats covered in this talk will be educational to folks in the Velox community, if not the broader Presto community. Audiences will walk away with solid insights into how to analyze performance deltas, formulate hypotheses, and validate them through experimentation. This talk will go some way toward raising awareness about Velox (the new Native Presto Worker), shedding light on the engineering investment in this project and exciting the broader Presto community about the type of improvements that are constantly incorporated into Velox.

Speakers

Shiyu Gan

Software Engineer, ByteDance
Shiyu is a Software Engineer at ByteDance. He previously worked at Google and Microsoft in infrastructure and low-level systems. He was one of the core engineers that productionized the virtualization technology that enabled SQL Server on Linux.



Wednesday December 6, 2023 2:45pm - 2:55pm PST
Hahn Auditorium
  Prestissimo Track, Lightning Talks

2:45pm PST

Presto Express - Mingjia Hang, Uber
Presto Express is a sub-project under Presto Governance at Uber. Currently, the P50 of scheduled query execution time is under a minute, but queries can wait in the queue for a very long time due to cluster congestion. This project leverages historical data to predict upcoming query execution times and optimize cluster routing: we use historical data keyed by query fingerprints to estimate the execution time of upcoming queries and route them to specific clusters. This helps reduce the queue time of short-running queries and increases the throughput of the system.

Speakers

Mingjia Hang

Software Engineer, Uber
Jane is a software engineer at Uber where she works on the Presto interactive analytics team. Previously she worked at Palo Alto Networks, where she focused on large-volume data ingestion and data interpretation.



Wednesday December 6, 2023 2:45pm - 2:55pm PST
Boole Room
  Presto Track, Lightning Talks

2:55pm PST

Break
Coffee + Tea
- Peerless Organic Fair Trade French Roast and Swiss Water Process Decaf, assorted Numi teas served with condiments (half-and-half, 2% milk, oat milk)
- spring water
- assorted sodas
- sparkling water and flavored sparkling water

Assorted Gourmet Chips

PM Snacks
dried fruit with candied and roasted seeds
individually wrapped chocolate and hard candy

Wednesday December 6, 2023 2:55pm - 3:15pm PST
Grand Hall

3:15pm PST

Revolutionizing Data Analytics - Open-Source Hardware Acceleration Support for Presto and Velox - Krishna Maheshwari, NeuroBlade
Explore the integration of open-source support for hardware (HW) acceleration into Presto and Velox, revealing its transformative potential in data analytics. Learn how HW acceleration support elevates data analytics efficiency, providing insights into the future of the field. We'll detail our collaboration with the Velox community to open-source HW acceleration support, which promises a more than 30x improvement in performance per dollar, eliminating the need to rearchitect the data lakehouse to achieve larger scales. Join us to unlock accelerated data analytics, shaping the future of your organization's data capabilities.

Speakers

Krishna Maheshwari

CPO, NeuroBlade


Wednesday December 6, 2023 3:15pm - 3:45pm PST
Hahn Auditorium

3:15pm PST

No Speed Limit: Use Hudi, DBT and Presto for Blazing Fast Analytics - Nadine Farah, Onehouse
Traditional analytics approaches often struggle to keep up, shackled by processing bottlenecks and cumbersome data pipelines. Hudi, DBT and Presto are designed to push the boundaries of data processing speeds, leading to blazing-fast analytics. To further enhance the efficiency of upsert operations, Hudi has introduced a new record-level index, improving speeds by orders of magnitude. This advancement enables Hudi to significantly accelerate computationally demanding MERGE operations, eliminating the need for full table scans. Building upon this foundation, DBT offers a unified framework that transforms this raw data into refined, trustworthy models. This streamlined data then becomes fertile ground for Presto, which equips users with robust ANSI SQL capabilities. By combining these three technologies, engineers can run their analytics at unprecedented speeds.

Speakers

Nadine Farah

Head of dev rel, Onehouse
Nadine Farah is leading Onehouse's developer initiatives. She's passionate about bridging engineering, product & marketing to help drive product adoption. She previously led Rockset's developer initiatives, focusing on building technical content to drive developer adoption for real-time...


Wednesday December 6, 2023 3:15pm - 3:45pm PST
Boole Room

3:50pm PST

Prestissimo: A Year In, The Path to Veloxification - Amit Dutta, Meta Platforms, Inc
Prestissimo is an ambitious undertaking to replace Presto’s Java worker with C++. We leverage Velox, an open source execution engine, to build the new C++ worker. In this talk we will briefly introduce Velox and its various components, and discuss the benefits it brings to Presto. We present the high-level design of Velox’s integration into Presto, aka Prestissimo. Prestissimo has been running in production for over a year at Meta, and we will use this opportunity to dive deep into the learnings we’ve garnered from the Experimentation Platform workload. Veloxification is also underway for Spark as the Gluten project. Prestissimo is at the cutting edge of Presto’s performance push. We hope our learnings and findings from running Presto’s C++ worker in production will be of interest not only to those looking to push Presto’s performance to the next level but also to other engines considering veloxification.

Speakers

Amit Dutta

Software Engineer, Meta
Amit is a Software Engineer at Meta Platforms Inc. At Meta, he has worked on the design and development of multiple data warehouse query engine products, the consolidation of query engines, and generally improving the reliability and scalability of complex systems. Prior to Meta, Amit completed...



Wednesday December 6, 2023 3:50pm - 4:20pm PST
Hahn Auditorium
  Prestissimo Track, Lightning Talks

3:50pm PST

Presto Optimization with Distributed Caching on Data Lake - Hope Wang & Beinan Wang, Alluxio
Presto users and developers often face challenges like slow, inconsistent query performance and high API and egress costs when using cloud storage like S3. In this talk, Beinan and Hope will share how to overcome these challenges using caching in Presto. They will discuss the distributed caching design with real-world examples. You will learn:
  • The challenges of data locality and query latency in Presto-powered data lakes
  • How to address these challenges through segmented data file caching, soft-affinity scheduler policies, cache filtering, TTL, and customized eviction
  • How Meta, Uber, ByteDance, and Newsbreak have used caching to optimize interactive queries, maximize cache hit rates, cut cloud storage costs, and accelerate queries
  • Best practices for setting up, using, and measuring TPC-DS benchmark results

Speakers
Beinan Wang

Software Engineer, Alluxio
Dr. Beinan Wang is a tech lead manager at Alluxio with extensive experience in data infrastructure. Prior to Alluxio, he was the Tech Lead of the Interactive Query team at Twitter, where he built large-scale distributed SQL systems for Twitter’s data platform. He has twelve-year... Read More →
Hope Wang

Developer Advocate, Alluxio
Hope Wang is a Presto Contributor and a Developer Advocate at Alluxio. She has a decade of experience in Data, AI, and Cloud. An open-source contributor to PrestoDB, Trino, and Alluxio, she currently works at Alluxio as a developer advocate and previously worked in venture capital... Read More →



Wednesday December 6, 2023 3:50pm - 4:20pm PST
Boole Room

4:25pm PST

History Based Optimizer - Lyublena Antova, Meta
The History Based Optimizer (HBO) makes query plans more efficient by learning from past executions of similar queries. We dive into the framework design and how several query optimizations benefit from it.

Speakers
Lyublena Antova

Software Engineer, Meta
I am a Software Engineer at Meta and TL for the Presto Query Optimizer. I got my PhD in Computer Science from Cornell University. Before joining Meta, I worked at several database startups (Pivotal & Datometry).



Wednesday December 6, 2023 4:25pm - 4:55pm PST
Boole Room

4:25pm PST

Presto and Apache Iceberg - Ajay Gupte, IBM; Beinan Wang, Alluxio
Join us for a session on how Iceberg is transforming the data landscape with its innovative features, growing adoption, and open-source Java toolkit. Learn how watsonx.data leverages Apache Iceberg to support multiple engines and deliver a seamless data experience.

Next, Beinan will present a case study showing how PrestoDB and Iceberg can accelerate AI/ML pipelines for computer vision use cases. We are working on enabling Iceberg to manage metadata and data natively within the AI training process. By seamlessly integrating Iceberg with PrestoDB and AI/ML workflows, we can optimize the performance, flexibility, and governance of the entire data pipeline.

Speakers
Beinan Wang

Software Engineer, Alluxio
Dr. Beinan Wang is a tech lead manager at Alluxio with extensive experience in data infrastructure. Prior to Alluxio, he was the Tech Lead of the Interactive Query team at Twitter, where he built large-scale distributed SQL systems for Twitter’s data platform. He has twelve-year... Read More →
Ajay Gupte

Software Engineer, IBM
Ajay is a senior software engineer on the Netezza database team. He contributes to the open source Presto project and the Apache Iceberg connector.



Wednesday December 6, 2023 4:25pm - 4:55pm PST
Hahn Auditorium

5:00pm PST

Closing Session - Ali LeClerc, IBM
Speakers
Ali LeClerc

PrestoCon Chair, Presto Foundation | IBM
Ali has over a decade of experience in open source software, product, and community marketing. She was most recently Head of Community at Ahana, the SaaS for Presto company, which was acquired by IBM. She works closely with the Presto Foundation to drive open source programs. Prior... Read More →


Wednesday December 6, 2023 5:00pm - 5:10pm PST
Hahn Auditorium

5:15pm PST

Reception
Displayed Hors d'oeuvres
Deconstructed Charcuterie and Cheese Board
an assortment of local artisan cured meats and salamis
with grilled vegetables, grapes, strawberries,
an array of local and international cheeses,
gourmet crackers, breadsticks and crostini

Dim Sum
an assortment of Asian delights to include
- Vegetarian Potstickers
- Steamed Pork Bun
- Siu Mai - served with sweet tamari sauce,
sriracha and sambal chili paste
displayed in large bamboo baskets

Korean Pork Belly Slider
quick kimchi, bao bun + furikake Fries

Petite Dessert Display
miniature cheesecakes, assorted macarons

Avocado Chocolate Mousse
fresh whipped avocado topped with dark chocolate ganache

Premium Beer, Wine & Soda Bar
- Beer selections to include Lost Coast Tangerine, Deschutes
Fresh Squeezed IPA and Firestone 805, or similar.
- Wine selections to include Wente Morning Fog Chardonnay and
Wente Southern Hills Cabernet Sauvignon, or similar.
- Assorted Sodas, Spring Water and Sparkling Water.

Wednesday December 6, 2023 5:15pm - 6:30pm PST
Grand Hall
 
Venue: 1401 North Shoreline Boulevard, Mountain View, CA, USA