Facebook F8 Developer Conference

Every year, Facebook hosts its F8 Developer Conference, where the company shows off its latest features, highlights development plans for the coming year, and connects with the thousands of businesses that interact with the platform every day. Koddi sent two representatives to San Jose, California, last week to learn first-hand how best to leverage the platform and drive …

What is Immutable Infrastructure? Simply put, your hardware stack is created and maintained using the programming concept of immutability: once something is instantiated, it does not change. If an update is needed (whether for a scheduled upgrade or a bug fix), a new instance is created to replace the existing one. Thus, once a component is launched it is …
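To make the programming concept concrete, here is a minimal Python sketch of the same idea. The Instance class, its fields, and the upgrade helper are hypothetical names chosen for illustration, not part of any real provisioning tool: a launched instance is modeled as a frozen record, and an upgrade builds a replacement instead of editing the existing one.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Instance:
    """A launched component; frozen=True forbids changing it after creation."""
    name: str
    image_version: str


def upgrade(fleet, name, new_version):
    """Return a new fleet where the named instance is replaced, not modified."""
    return tuple(
        # replace() builds a brand-new Instance; the old object is untouched.
        replace(inst, image_version=new_version) if inst.name == name else inst
        for inst in fleet
    )


fleet = (Instance("web-1", "1.0"), Instance("web-2", "1.0"))
new_fleet = upgrade(fleet, "web-1", "1.1")
```

After the call, fleet still describes the old instances exactly as they were launched, while new_fleet holds the replacement for web-1. Trying to assign to a field of an existing Instance raises an error rather than silently changing it.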

In our platform we often have to fetch data from various locations (e.g. S3, SFTP, API) and in various formats (CSV, TSV, JSON, XML), because we have an incredibly diverse client and publisher catalog and each one provides data in its own way. As we have grown, we’ve amassed a long list of microservices, processes, and configurations that handle these different data sources and files. The biggest issue we’ve run into is that the various portions of the data pipeline do not interact as well as we would like, so when an error occurs anywhere in the process, it can be difficult to track down where it happened. We have begun to feel the strain from this, so we’re abstracting and centralizing as much as we can.
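As a rough illustration of what centralizing can look like, the sketch below routes every format through one entry point and attaches the source to any failure, so an error points back to where it came from. This is a simplified sketch and not our production code: the names (IngestError, PARSERS, parse) and the flat XML layout are assumptions made for the example, and fetching from S3, SFTP, or an API is left out.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


class IngestError(Exception):
    """Raised when a file cannot be parsed; records the source for tracing."""

    def __init__(self, source, fmt, cause):
        super().__init__(f"failed to parse {fmt} from {source}: {cause}")
        self.source = source
        self.fmt = fmt


def _parse_delimited(text, delimiter):
    # The first line is treated as the header row.
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def _parse_xml(text):
    # Assumed layout: <rows><row><field>value</field>...</row></rows>
    root = ET.fromstring(text)
    return [{child.tag: child.text for child in row} for row in root]


# One registry for every supported format (JSON is assumed to be a list of objects).
PARSERS = {
    "csv": lambda text: _parse_delimited(text, ","),
    "tsv": lambda text: _parse_delimited(text, "\t"),
    "json": json.loads,
    "xml": _parse_xml,
}


def parse(source, fmt, text):
    """Parse text in the given format into records, tagging errors with source."""
    try:
        parser = PARSERS[fmt]
    except KeyError:
        raise IngestError(source, fmt, "unsupported format") from None
    try:
        return parser(text)
    except Exception as exc:
        raise IngestError(source, fmt, exc) from exc


rows = parse("s3://example-bucket/report.csv", "csv", "id,clicks\n1,10\n")
```

Because every format passes through parse, a malformed feed surfaces as a single exception type that names the source and format, instead of a different failure mode in each service.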

In the past few years, we’ve seen an explosion of chat bots across multiple industries. We are often asked what a chat bot can do and how it would benefit our product. In our experience, chat bots need to be tailored specifically to what a client wants; otherwise, they feel very generic (much like calling into an automated call center). So how can we make a bot succeed in an area crowded with thousands of existing bots?

At Koddi we’re always looking for ways to increase the speed and stability of our platform. One of our latest projects is speeding up our daily ingestion of data.

All of our data is initially stored in flat files on S3 before being loaded into our database. We’re currently integrating Apache Spark into our load process to drastically speed up our loads. One problem we ran into is that S3 doesn’t behave like a normal file system in terms of read and write speeds. This is where Alluxio comes in. Alluxio is a “memory speed virtual distributed storage system” that sits between compute frameworks (such as Spark, MapReduce, Flink, etc.) and a storage system (Amazon S3, Google Cloud Storage, HDFS, Ceph, etc.). This allows for dramatically faster data access, with some users seeing a 30x increase in data throughput. For a more in-depth overview of Alluxio, see their documentation.

Every engineer out there is looking to build something amazing, just like every visionary likes to see their ideas come to life. Unfortunately, innovation can get lost in the day-to-day: technicalities, competing priorities, and requirements documents. All of these have created pitfalls for many promising projects, but that doesn’t have to be the case if you are aware of where those pitfalls may pop up and build in a little autonomy for bridging those gaps.

Here are a few things that we do to keep our engineering team connected to and at the forefront of innovation.