Showing posts with label random forest. Show all posts
Showing posts with label random forest. Show all posts

Friday, December 13, 2019

Java Microservices: What do you need to tweak to optimize throughput and response times

Performance tuning usually goes something like followed:
  • a performance problem occurs
  • an experienced person knows what is probably the cause and suggests a specific change
  • baseline performance is determined, the change is applied, and performance is measured again
  • if the performance has improved compared to the baseline, keep the change, else revert the change
  • if the performance is now considered sufficient, you're done. If not, return to the experienced person to ask what to change next and repeat the above steps
This entire process can be expensive. Especially in complex environments where the suggestion of an experienced person is usually a (hopefully well informed) guess. This probably will require quite some iterations for the performance to be sufficient. If you can make these guesses more accurate by augmenting this informed judgement, you can potentially tune more efficiently.

In this blog post I'll try to do just that. Of course a major disclaimer applies here since every application, environment, hardware, etc is different. The definition of performance and how to measure it is also something which you can have different opinions on. In short what I've done is look at many different variables and measuring response times and throughput of minimal implementations of microservices for every combination of those variables. I fed all that data to a machine learning model and asked the model which variables it used to do predictions of performance with. I also presented on this topic at UKOUG Techfest 2019 in Brighton, UK. You can view the presentation here.

Monday, March 27, 2017

Machine learning: Getting started with random forests in R

According to Gartner, machine learning is on top of the hype cycle at the peak of inflated expectations. There is a lot of misunderstanding about what machine learning actually is and what it can be done with it.

Machine learning is not as abstract as one might think. If you want to get value out of known data and do predictions for unknown data, the most important challenge is asking the right questions and of course knowing what you are doing, especially if you want to optimize your prediction accuracy.

In this blog I'm exploring an example of machine learning. The random forest algorithm. I'll provide an example on how you can use this algorithm to do predictions. In order to implement a random forest, I'm using R with the randomForest library and I'm using the iris dataset which is provided by the R installation.