Testing 1 TB/day Data Ingestion in a Few Hours and Just One Click! – Part 2

Bharadwaj Narasimha
Mar 01, 2018

In my previous blog, I discussed the framework we built for testing large-scale machine log data. In this post, I will share the results of our tests. Every test run with varying server clusters was executed through our automation framework, and therefore required minimal effort on our side beyond clicking a button after deciding the number of servers we needed.

The scalability of our platform broadly depends on two things:

Testing 1 TB/day Data Ingestion in a Few Hours and Just One Click!

Venkata Sai Vamsi
Feb 23, 2018

While we run must-run performance tests for every release to make sure that newly added features do not impact the performance of our platform, we (Engineering @ Glassbeam) wanted the ability to run large-scale tests periodically too. However, running large-scale performance tests is a time-consuming and expensive affair. Such tests require spinning up tens of machines, which can take days to set up and run.

Machine Learning Applications for Medical Imaging Devices

Mohammed Guller
Feb 16, 2018

In modern healthcare, medical imaging devices such as X-ray, Computed Tomography (CT), Ultrasound, and Magnetic Resonance Imaging (MRI) devices play a critical role. These devices allow healthcare professionals to examine patients and determine the root cause of their symptoms, and they help providers develop the right treatment plan for each patient.

Build vs Buy for Log Analytics - How Glassbeam suits both

Pramod Sridharamurthy
Feb 11, 2018

With the advantages of log analytics now so well established, the case for mining your machine logs hardly needs to be made. If you are still wondering whether mining your machine logs is important, read these two documents from Harbor Research: The Internet of Things, Machines and Data Analytics, and Product Analytics & Intelligence — or just search the web and you will find compelling reasons.

Integrating Apache Kafka with Glassbeam – Behind the scenes: Opening up the Platform for Integrating with other Data Stores

Bharadwaj Narasimha
Feb 02, 2018

Opening up our platform and letting customers build on top of it has been on our minds for some time. Now that this capability is built into the platform, it opens up a variety of possibilities for our customers. In this two-part series, I will explain the why and the how of it.

Creating DSL with Antlr4 and Scala

Pramod Sridharamurthy
Jan 11, 2018

Domain-specific languages go a long way toward improving developer productivity. The first thing you need when creating a DSL is a parser that takes a piece of text and transforms it into a structured format (such as an abstract syntax tree) so that your program can understand it and do something useful with it. A DSL tends to live for years, so when choosing a tool for building your DSL's parser, you need to make sure it is easy to maintain and that the language is easy to evolve.
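To make the text-to-AST idea concrete, here is a minimal sketch in Scala for a toy arithmetic DSL. It is hand-rolled purely for illustration — with Antlr4 you would write a grammar and let the tool generate this parsing layer for you — but it shows the two halves the paragraph describes: turning text into a tree, and then walking the tree to do something useful.

```scala
// Toy DSL: parse expressions like "1 + 2 * 3" into an AST, then evaluate it.
// Hand-rolled recursive descent for illustration only; Antlr4 would
// generate the lexer/parser from a grammar file instead.
sealed trait Expr
case class Num(value: Int) extends Expr
case class Add(l: Expr, r: Expr) extends Expr
case class Mul(l: Expr, r: Expr) extends Expr

object TinyParser {
  // Lexing: split the input on whitespace into numbers and operators.
  private def tokens(s: String): List[String] =
    s.split("\\s+").toList.filter(_.nonEmpty)

  // expr := term ('+' term)*   -- lower precedence than '*'
  private def expr(ts: List[String]): (Expr, List[String]) = {
    var (left, rest) = term(ts)
    while (rest.headOption.contains("+")) {
      val (right, rest2) = term(rest.tail)
      left = Add(left, right); rest = rest2
    }
    (left, rest)
  }

  // term := num ('*' num)*
  private def term(ts: List[String]): (Expr, List[String]) = {
    var (left, rest) = num(ts)
    while (rest.headOption.contains("*")) {
      val (right, rest2) = num(rest.tail)
      left = Mul(left, right); rest = rest2
    }
    (left, rest)
  }

  private def num(ts: List[String]): (Expr, List[String]) =
    (Num(ts.head.toInt), ts.tail)

  // Text in, tree out -- this is the parser's job.
  def parse(s: String): Expr = expr(tokens(s))._1

  // Walking the AST is the "do something useful with it" step.
  def eval(e: Expr): Int = e match {
    case Num(v)    => v
    case Add(l, r) => eval(l) + eval(r)
    case Mul(l, r) => eval(l) * eval(r)
  }
}
```

For example, `TinyParser.parse("1 + 2 * 3")` yields `Add(Num(1), Mul(Num(2), Num(3)))`, which `eval` reduces to 7. Separating the tree from what you do with it is exactly why an AST-based parser makes a DSL easy to evolve: new interpretations of the language are just new tree walks.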

Actionable feedback right through edge computing

Apr 07, 2016

Continuing our discussion on Edge Computing and Analytics… Remember, we said that a key benefit of the Edge was local decision making. Typically, that will preclude access to the install-base data. However, there is a wealth of information that can be gleaned from install-base data (such as machine learning output), and it seems a shame not to be able to use it at the edge.

Glassbeam edge computing – a primer

Feb 22, 2016

As the Internet of Things inevitably comes into its own, the origin of data has evolved from people to machines to "things". Technologies emerged from leaders like Google and Facebook to enable analyzing tons of data in massive data farms deployed in the cloud. All that is well and good, but the approach itself requires moving this "ton" of data to a central location and partitioning it across a large number of nodes so that the analysis can be parallelized. Consider that Netflix has over 1,000 nodes in its cluster. Hmmm, doable, but at some point the laws of physics start to interfere.

Refining the art of query performance

Oct 26, 2014

Ever wonder how we power those "which controller went down today" queries that span thousands of databases, amounting to hundreds of terabytes of log data every day? How do we deal with terabytes of data in a robust and efficient manner? We call it harmonic in-memory query management.

We’ve been working with a distributed Cassandra cluster for almost a year. During that time, we have learned a good deal about scalability, and along the way we have collected some insights into optimal query performance.

Scalable Machine Learning with Apache Spark and MLbase

Jul 11, 2014

Big data applications are no longer a nice-to-have but a must-have for many organizations. Many enterprises are already using the massive amounts of data being collected in their organizations to understand and serve their customers better. Those that have not yet learned how to use it will, over time, be left behind. Companies are now looking for platforms that not only provide analytical capabilities over their data but also help them become proactive and predictive. Hence, machine learning capability is becoming an important component of any analytical platform.