Understanding and adopting Splunk

Splunk has been trending in the industry for quite some time, but what do we know about its uses and the market Splunk is targeting?

Splunk takes its name from the word “spelunking”, which refers to the activities of locating, exploring, studying and mapping caves. The platform works in three broad steps:

  1. Data indexing: Splunk collects data from different locations, combines it and stores it in a centralized index.
  2. Index-based searching: the use of indexes gives Splunk a high degree of speed when searching for the sources of problems.
  3. Result filtering: Splunk provides users with several tools for filtering results, allowing faster detection of problems.
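To give an intuition for why index-based searching is fast, here is a minimal sketch of an inverted index in Python. This is an illustration of the general technique, not Splunk's actual engine; the sample events are invented.

```python
from collections import defaultdict

# Invented sample events standing in for collected machine data
events = [
    "ERROR disk full on server01",
    "INFO backup completed on server02",
    "ERROR timeout on server01",
]

# Inverted index: each term maps to the set of event ids containing it,
# so a search never has to scan every raw event.
index = defaultdict(set)
for i, event in enumerate(events):
    for term in event.lower().split():
        index[term].add(i)

def search(*terms):
    """Return events containing all of the given terms (AND search)."""
    ids = set.intersection(*(index.get(t.lower(), set()) for t in terms))
    return [events[i] for i in sorted(ids)]

print(search("error", "server01"))
```

Filtering then amounts to intersecting the postings for each term, which is why adding more search terms typically makes the lookup cheaper, not costlier.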

For more than a year I have been experimenting with Splunk in several areas: security, storage, infrastructure, telecom and more. We at ESI have a very comprehensive laboratory, which allowed me to push my experiments further.

In addition to using all this internal data, I used open data to test Splunk’s ability to interpret it.

I tested the open data from the “montreal.bixi.com” site; this is raw data formatted as follows:

Start date – Start station number – Start station – End date – End station number – End station – Account type – Total duration (ms)

With this data, we can identify the most common routes, estimate the average duration of a trip, and find the docking stations most in demand for bike pickup or return.
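As a sketch of what such an analysis looks like, the following Python snippet computes the most common route and the average trip duration from records shaped like the open-data layout above. The station names and durations are invented sample rows, not real Bixi data.

```python
from collections import Counter

# Invented sample trips following the open-data field layout;
# durations are in milliseconds, as in the raw data.
trips = [
    {"start_station": "Berri / de Maisonneuve", "end_station": "Peel / Ste-Catherine", "duration_ms": 540000},
    {"start_station": "Berri / de Maisonneuve", "end_station": "Peel / Ste-Catherine", "duration_ms": 600000},
    {"start_station": "Peel / Ste-Catherine", "end_station": "Berri / de Maisonneuve", "duration_ms": 480000},
]

# Most common route: count (start, end) pairs
routes = Counter((t["start_station"], t["end_station"]) for t in trips)
top_route, count = routes.most_common(1)[0]

# Average trip duration, converted from milliseconds to minutes
avg_min = sum(t["duration_ms"] for t in trips) / len(trips) / 60000

print(top_route, count)
print(round(avg_min, 1))
```

In Splunk the same questions would be answered with search-language aggregations over the indexed events, but the logic is the same: group by route, count, and average the duration field.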

For the service’s operations team, this provides a real-time or next-day prediction of which stations will need more bicycles, and more importantly where those bicycles will go. They could anticipate shortages or surpluses of bikes at the docking stations. If data is collected in real time, alerts could be issued to flag a potential shortage or surplus at a station. The system thus facilitates planning and allows the team to be proactive in meeting demand, rather than reactive. We could even detect a missing bicycle; for instance, a bike that has not been docked for more than 24 hours could trigger an alert so that the operations team can attempt to trace it.
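The 24-hour rule described above can be sketched in a few lines. This is a hypothetical illustration: the bike ids, timestamps and the idea of tracking “still-open” trips are all assumptions, not part of the published data.

```python
from datetime import datetime, timedelta

# Hypothetical check: a bike whose last trip started more than 24 hours
# ago and has no recorded end yet may be missing.
now = datetime(2016, 7, 2, 12, 0)
last_checkout = {  # invented bike id -> start time of its still-open trip
    "B-1001": datetime(2016, 7, 2, 9, 30),   # out for 2.5 hours: fine
    "B-2002": datetime(2016, 6, 30, 8, 0),   # out for over 2 days: alert
}

missing = [bike for bike, started in last_checkout.items()
           if now - started > timedelta(hours=24)]
print(missing)
```

In a live deployment, a scheduled Splunk alert would run an equivalent search over recent events and notify the operations team when the result set is non-empty.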

For marketers, one might think this data is useless, but the opposite is true: the same data can be used to craft offers to attract customers, since we have the departure and arrival times, the duration of the trips, and the most used routes. One can thus identify the busiest time slots and run promotions or adjust rates according to traffic or customer loyalty objectives.

For management, the open data unfortunately does not give the price of trips according to user status (member or non-member), but the beauty of Splunk is that the collected data can be enriched with data coming from a third-party system, a database, or simply manually collected data. Management could then obtain reports and dashboards based on various factors, such as user status, trip duration, day of the week, and much more. We could even make comparisons with previous months or with the same month of the previous year. The applications are virtually limitless with data residing in Splunk: the only limit is our imagination!
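To illustrate the enrichment idea, here is a minimal sketch of joining trip records with an external lookup table, analogous to a Splunk lookup. The fare figures and account types are invented for the example.

```python
# Hypothetical lookup table, as might come from a third-party system
# or a manually maintained file (fares are invented).
fares = {"member": 0.00, "non-member": 2.95}

trips = [
    {"account_type": "member", "duration_ms": 540000},
    {"account_type": "non-member", "duration_ms": 480000},
]

# Enrich each trip record with a fare field from the lookup
for t in trips:
    t["fare"] = fares[t["account_type"]]

revenue = sum(t["fare"] for t in trips)
print(revenue)
```

Once the enrichment field exists, reports and dashboards can slice on it (revenue by status, by day of week, and so on) exactly as they would on any native field.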

These are of course fictitious examples built with available open data, but they could become real with your own systems and data.

Collecting information from a website can provide visibility for everyone in a company: operations receives system overload alerts, marketers get information about the origin of connections to target their campaigns, and management gets a view of the user experience, as well as the performance metrics that confirm SLAs.

Whether your needs are in security, operations, marketing, analytics or elsewhere, Splunk can address them. In addition to the 1,200 applications available in its portal, you can create your own tables, reports and alerts. You can use its Pivot tool to let people easily explore the data and build their own dashboards.

The platform is easy to use and does not require special expertise: you only need to get your data in.

Do not hesitate to contact ESI for a presentation or a demo; it will be my pleasure to show you how to “Splunk”.

Guillaume Paré
Senior Consultant, Architecture & Technologies – ESI Technologies

Account of the NetApp Insight 2016 Conference

The 2016 Edition of NetApp Insight took place in Las Vegas from September 26 to 29.
Again this year, NetApp presented its ‘Data Fabric’ vision unveiled two years ago. According to NetApp, the growth in capacity, velocity and variety of data can no longer be handled by the usual tools. As stated by NetApp’s CEO George Kurian, “data is the currency of the digital economy”, and NetApp wants to be seen as a bank helping organizations manage, move and globally grow their data. The current challenge of the digital economy is thus data management, and NetApp clearly intends to be a leader in this field. This vision becomes clearer every year across the products and platforms added to the portfolio.

New hardware platforms

NetApp took advantage of the conference to officially introduce its new hardware platforms, which integrate 32Gb FC SAN ports, 40GbE network ports, NVMe SSD embedded read cache and 12Gb SAS-3 ports for back-end storage. Additionally, the FAS9000 and AFF A700 use a new, fully modular chassis (including the controller module) to facilitate future hardware upgrades.

Note that the SolidFire platforms received much attention from both NetApp and the public: the former to explain their position in the portfolio, the latter to find out more about this extremely agile and innovative technology. https://www.youtube.com/watch?v=jiL30L5h2ik

New software solutions

  • SnapMirror for AltaVault, available soon through the SnapCenter platform (replacing SnapDrive/SnapManager): this solution allows backup of NetApp volume data (including application databases) directly to the cloud (AWS, Azure & StorageGRID). https://www.youtube.com/watch?v=Ga8cxErnjhs
  • SnapMirror for SolidFire is currently under development. No further details were provided.

The features presented reinforce the objective of offering a unified data management layer through the NetApp portfolio.

The last two solutions are more surprising, since they do not require any NetApp equipment to be used; they are available on the AWS Marketplace (SaaS).

In conclusion, we feel that NetApp is taking steps to be a major player in the “software defined” field, while upgrading its hardware platforms to get ready to meet the current challenges of the storage industry.

Olivier Navatte, Senior Consultant – Storage Architecture

What about Big Data & Analytics?

After the “cloud” hype, here comes the “big data & analytics” one, and it’s not just hype. Big data & analytics:

  • enables companies to make better business decisions faster than ever before;
  • helps identify opportunities for new products and services and bring innovative solutions to the marketplace faster;
  • assists IT and helpdesk teams in reducing mean time to repair and troubleshoot, while providing reliable metrics for better IT spending planning;
  • guides companies in improving their security posture, with more visibility on the corporate network to identify suspicious activities that go undetected by traditional signature-based technologies;
  • serves to meet compliance requirements.

In short, it makes companies more competitive! One simply has to go on YouTube to see the amazing things companies are doing with Splunk, for example.

I remember when I started working in IT sales in the mid-90s: a “fast” home Internet connection was 56k and the Internet was rapidly gaining in popularity. A small business owner called me and asked, “What are the competitive advantages of having a website?”, to which I replied, “It’s no longer a competitive advantage, it’s a competitive necessity.” To prove my point, I asked him to search for his competitors on the Internet: he saw that all of his competitors had websites!

The same can now be said of big data & analytics. With all the benefits it brings, it is becoming a business necessity. But before you start rushing into big data & analytics, know the following important facts:

  1. According to Gartner, 69% of corporate data has no business value whatsoever.
  2. Still according to Gartner, only 1.5% of corporate data is high-value data.

This means that you will have to sort through a whole lot of data to find the valuable material you need to grow your business, reduce costs, outpace the competition, find new revenue sources, etc. It is estimated that every dollar invested in a big data & analytics solution brings four to six dollars of infrastructure investment (new storage to hold all that priceless data, CPU to analyze it, security to protect it, etc.).

So before you plan a $50,000 investment in a big data & analytics solution only to find out it comes with a $200,000 to $300,000 investment in infrastructure, talk to subject matter experts. They can help design strategies to home in on the 1.5% of high-value data, reducing the required investment while maximizing the results.

Charles Tremblay, ESI Account Manager