In my previous blog, I discussed the framework we have built for testing large-scale machine log data. In this post, I will share the results of our tests. Every test run, across varying server cluster sizes, was executed through our automation framework, so it required minimal effort on our side beyond clicking a button once we had decided how many servers we needed.
The scalability of our platform is broadly dependent on two things:
While we run mandatory performance tests for every release to make sure that newly added features do not impact the performance of our platform, we (Engineering @ Glassbeam) wanted the ability to run large-scale tests periodically too. However, running large-scale performance tests is a time-consuming and expensive affair. Such tests require spinning up tens of machines, which can take days to set up and run.
In modern healthcare, medical imaging devices such as X-ray, Computed Tomography (CT), Ultrasound, and Magnetic Resonance Imaging (MRI) machines play a critical role. These devices allow healthcare professionals to examine a patient, determine the root cause of their symptoms, and develop the right treatment plan.
Opening our platform to our customers and letting them build on top of it had been on our minds for some time. Now that this capability is built into our platform, it opens up a variety of possibilities for our customers. In this two-part series, I will explain the why and the how of it.
Domain-specific languages (DSLs) go a long way toward improving developer productivity. The first thing you need when creating a DSL is a parser that takes a piece of text and transforms it into a structured format (such as an Abstract Syntax Tree) so that your program can understand it and do something useful with it. A DSL tends to stay around for years, so when choosing a tool for building your DSL's parser, you need to make sure the language will be easy to maintain and evolve.
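To make the text-to-AST step concrete, here is a minimal sketch of a hand-rolled recursive-descent parser for a hypothetical arithmetic DSL (the grammar, token names, and node classes are all illustrative, not any particular parser tool):

```python
import re
from dataclasses import dataclass

# Tokenizer for a tiny hypothetical DSL: integers and single-character operators.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")

@dataclass
class Num:
    value: int

@dataclass
class BinOp:
    op: str
    left: object
    right: object

def tokenize(text):
    """Turn raw text into (kind, value) tokens."""
    tokens = []
    for num, op in TOKEN_RE.findall(text):
        tokens.append(("NUM", int(num)) if num else ("OP", op))
    return tokens

def parse(tokens):
    """Parse `expr := NUM (('+'|'-') NUM)*` into a left-associative AST."""
    def term(i):
        kind, value = tokens[i]
        assert kind == "NUM", f"expected a number at token {i}"
        return Num(value), i + 1

    node, i = term(0)
    while i < len(tokens):
        _, op = tokens[i]
        right, i = term(i + 1)
        node = BinOp(op, node, right)
    return node

ast = parse(tokenize("1 + 2 - 3"))
# AST: BinOp('-', BinOp('+', Num(1), Num(2)), Num(3))
```

Because the grammar lives in plain, testable code, extending the language later (new operators, new node types) is a local change, which is exactly the maintainability concern raised above.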
Continuing our discussion on Edge Computing and Analytics… Remember, we said that a key benefit of the Edge was local decision making. Typically, that will preclude access to the install-base data. However, there is a wealth of information that can be gleaned from the install-base data (such as machine learning output). It seems a shame not to be able to utilize that on the edge.
As the Internet of Things inevitably comes into its own, the origin of data has evolved from people to machines to "things". Technologies emerged from leaders like Google and Facebook to enable analyzing tons of data in massive data farms deployed in the cloud. That is all well and good, but the approach itself requires moving this "ton" of data to a central location and partitioning it across a large number of nodes so that analysis can be parallelized. Consider that Netflix has over 1,000 nodes in its cluster. Doable, but at some point the laws of physics start to interfere.
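The partition-then-parallelize approach described above can be sketched in a few lines. This is a toy illustration only: the "nodes" are in-process lists and the analysis is a simple event count, standing in for the real distributed machinery:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 4  # hypothetical cluster size

def partition(records, num_nodes):
    """Hash-partition records so each 'node' gets a disjoint shard."""
    shards = [[] for _ in range(num_nodes)]
    for record in records:
        shards[hash(record) % num_nodes].append(record)
    return shards

def analyze_shard(shard):
    """Per-node analysis step: count event types within one shard."""
    return Counter(shard)

logs = ["error", "warn", "error", "info", "error", "warn"]
shards = partition(logs, NUM_NODES)

# Scatter the shards to workers, then merge (reduce) the partial results.
with ThreadPoolExecutor(max_workers=NUM_NODES) as pool:
    total = sum(pool.map(analyze_shard, shards), Counter())

# total == Counter({'error': 3, 'warn': 2, 'info': 1})
```

The cost the paragraph above points at is the first step: before any of this parallelism pays off, every record has to be shipped to the central cluster in the first place.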
Ever wonder how we power those "which controller went down today" queries that span thousands of databases, amounting to hundreds of terabytes of log data every day? How do we deal with terabytes of data in a robust and efficient manner? We call it harmonic in-memory query management.
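To give a rough feel for the fan-out pattern such a query implies, here is a generic scatter-gather sketch. To be clear, this is not Glassbeam's actual harmonic implementation; the database names, rows, and helper functions are all hypothetical stand-ins for thousands of real databases:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-database log tables; in production these would be
# thousands of separate databases, not in-memory dicts.
DATABASES = {
    "db-east": [{"controller": "c1", "status": "down"},
                {"controller": "c2", "status": "up"}],
    "db-west": [{"controller": "c3", "status": "down"}],
}

def query_one(db_name, predicate):
    """Run the predicate against a single database's rows."""
    return [row for row in DATABASES[db_name] if predicate(row)]

def scatter_gather(predicate):
    """Fan the query out to every database and merge the results."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(query_one, name, predicate)
                   for name in DATABASES]
        results = []
        for future in futures:
            results.extend(future.result())
    return results

down = scatter_gather(lambda row: row["status"] == "down")
# e.g. the rows for controllers c1 and c3
```

The robustness and efficiency questions raised above live in the details this sketch skips: timeouts, retries, and keeping hot data in memory rather than re-reading it per query.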
We’ve been working with a distributed Cassandra cluster for almost a year. During that time, we have learned a great deal about achieving scalability, and along the way we have collected some insights on achieving optimal query performance.
Big data applications are no longer a nice-to-have but a must-have for many organizations. Many enterprises are already using the massive amounts of data being collected in their organizations to understand and serve their customers better. Those that have not yet learned how to use it will, over time, be left behind. Companies are now looking for platforms that not only provide analytical capabilities over their data but also help them become proactive and predictive. Hence, machine learning capability is becoming an important component of any analytical platform.