
Jenkins Pipeline

Introduction


I believe in simplicity. It means I think things should be simple to work with.
It doesn't mean they always have to be simple inside. Internally they can be complex, sometimes very complex.

At the same time, I distinguish two types of complexity:
  1. complexity caused by indirect and non-obvious relationships between components,
  2. complexity caused by mess within the components themselves and messy relationships between them.
The first one means the product is implemented in a very clever way, built on a large number of implicit and often non-obvious assumptions.

The second one means the product's complexity is inflated by confusing, illogical and messed-up relationships between components. Unlike the first one, the relationships are visible, but they make no sense: not because they are over-cleverly designed, but because they are spaghetti-like.

Often both types are found together in the same product. I hope you never have to deal with such a combination.

The way to fix complexity of type #1 is to remove implicit assumptions by adding smaller components with visible relationships between them.

However, to fix complexity of type #2, you need to understand it deeply. And what if the thing only looks complicated because you don't understand it yet? To be able to answer this question, I always start with research, and my research usually produces some kind of diagram. I believe that visualization is the best way to tackle complexity. At least it always works well for me.

Simply put, visualization is one of the best ways to simplify hard and complex things. Even a very basic diagram is still better than nothing. Sometimes it takes time to find the right form of visualization at the right level of abstraction, but once you have it, you are half done.

Continuous Deployment


This post is about visualization, and how it can help combine parts into a single simple picture. More specifically, it is about how visualization can help simplify continuous delivery.

Continuous delivery is the process of getting your product from source code into production. It usually happens through a pipeline of jobs: the first job compiles the source code and builds artifacts, then come integration testing, deployment to a staging environment, and eventually deployment to production. In Jenkins, a job is usually created for each of these steps, and each job triggers the next one once it finishes successfully. For example, there would be jobs like 'Build XYZ', 'Test XYZ', 'Deploy XYZ to Staging', 'Deploy XYZ to 1-box' and 'Deploy XYZ to Production'.

The default Jenkins view presents those jobs as a flat list, with no visible relationships between them. But there are relationships, and all of those jobs exist for a single, most important goal: to get a new version into production for the customers! So the relationships play an important role, yet they are hidden from us as users.

You might not even feel it immediately, but this flat list of jobs brings complexity. There is a logic behind those jobs, but it stays hidden from the viewer unless one is ready to spend time understanding how things work.

Pipelines in Jenkins


The good thing is that Jenkins already allows you to remove this complexity, and it does so via visualization of the pipeline of jobs you've created.

This support is brought by the "Build Pipeline Plugin". This plugin adds a new view type called "Build Pipeline View".



In the next step you need to pick the first job of the pipeline.



The pipeline is then created based on job dependencies: it will contain the jobs triggered by the first job, then the jobs triggered by those jobs, and so on.

Now, once someone makes a change to the XYZ source, a new "Build XYZ" job is triggered. Once this job finishes successfully, it triggers "Test XYZ" and so on. As you can see, no functionality has changed here, but with pipelines it is possible to visualize both the dependencies between those jobs and the current state of the CD process.



That makes things so much simpler to work with. You can understand the build structure and its current state at a single glance.

What's more, Jenkins 2.0 comes with built-in support for pipelines.

Blue Ocean


I'd also like to mention the initiative called "Blue Ocean", which sets a goal to build a better visualization of pipelines in Jenkins. I'd be happy to see it live one day.

Other Products


There are a bunch of other products that can help you create a build/deployment pipeline:
  1. AWS CodePipeline - https://aws.amazon.com/codepipeline/
  2. Concourse CI - https://concourse.ci/
  3. Bitbucket Pipelines - https://bitbucket.org/product/features/pipelines

Simple Multiple BloomFilters data structure implementation on Java

As a follow-up to my previous post about using the Multiple Bloom Filters data structure to identify hot values, I decided to write a simple, naive implementation in Java and open source it.

The project can be found on GitHub and is proudly named multi-bloom-filter.

It comes with a basic little class called MultiBloomFilter which does most of the job. It accepts the number of enclosed bloom filters, the capacity of each BF and the duration before the head BF is reset. One can also specify which hash function is used and how many times the hash function should be applied to each value.

Simple example:
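Here is a minimal sketch of such usage; the constructor arguments follow the description above, while the method names (add(), contains()) and the reset-period type are assumptions, so the actual API may differ slightly:

// an MBF with 3 internal bloom filters, 1000 entries each,
// where the head filter is reset every second
// (constructor signature and method names are assumptions)
MultiBloomFilter filter = new MultiBloomFilter(3, 1000, Duration.ofSeconds(1));

filter.add("hot-key");

// some time later the head filter gets reset,
// which drops only a part of the accumulated data
Thread.sleep(1500);

// the key is added again and is still identified as hot
filter.add("hot-key");
assert filter.contains("hot-key");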

This short example shows that the MBF resets only one of the internal BFs at a time. It means that whenever a reset happens, only part of the data is removed, and whenever a hot key is added again, it is still identified as such.

Once again, MBF is a great solution if you need to find a set of hot values for some period of time. In particular, it helps to put only hot values into the cache. If we have many hosts that use a single distributed cache service, then using MBF might save us from the redundant traffic of putting cold data into the cache, where it would be evicted pretty quickly anyway. Also, since hot keys are in the MBF, there is a high chance they are in the distributed cache as well. Thus the application has a kind of "bloom filter" to check the chance that a value can be found in the cache for a specified key.
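For illustration, here is a rough sketch of that idea. hotKeys stands for the MultiBloomFilter above, and cacheClient is a hypothetical distributed cache client, not part of the library:

// store a value in the distributed cache only if the key looks hot
// (hotKeys and cacheClient are hypothetical names, used only for illustration)
void storeIfHot(String key, byte[] value) {
    boolean seenRecently = hotKeys.contains(key); // was the key accessed within the recent window?
    hotKeys.add(key);                             // record this access
    if (seenRecently) {
        cacheClient.put(key, value);              // likely hot, worth the network round-trip
    }
    // cold (first-time) values are not sent to the cache at all
}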

There are many more use cases for the MBF data structure. Being able to work in a concurrent, scalable environment is another "feature" I love about Bloom filters, and about MultiBloomFilter in particular. To me, a good BloomFilter implementation that can grow and scale correctly and has mechanisms to evict data and fight false positives sounds like a very useful service.

Twister: A Simple Way to Manage Your Scripts

Imagine an average project that has many scripts: each is written using different practices, uses different argument names and naming conventions, and does something similar to another script but a bit differently. Sometimes there are so many scripts that it's hard to find the one you really need at this very moment. Moreover, scripts have no standard location and are often put in semi-random directories, so they are really hard to find. On top of that, many developers have similar scripts for different projects. Some scripts are on the PATH, others are relative to the project directory. The ones on the PATH are named in odd ways, because different versions are used for different projects. And the scripts are written in bash, ruby, python and so on.

Oh, so many troubles just because of scripts. That's the reason Twister was created. Twister is a simple framework/tool that allows you to create and manage project scripts. I use it in a few of my home projects and find it very helpful. About a year ago, I open sourced Twister. It's a Python project, which makes it simple to create scripts and execute them on any popular operating system.

SyncTab is open sourced



As I wrote yesterday on my Twitter, SyncTab has been open sourced. About a year ago I started working on a project that would simplify my life. I used my new smartphone a lot to browse the internet and read RSS, but the most interesting articles I wanted to read in the browser on my laptop. I can be pretty lazy, and it's easy for me to forget things, so I wanted to be sure that I wouldn't leave those links without attention. The only way I could think of was to open the link in the browser as soon as possible.

Memory v0.1 is released

A few weeks ago I decided to create a simple library that emulates virtual memory, but with some extra features. My primary interest is to play with memory (de)allocation algorithms and replication algorithms, to improve my knowledge of concurrent programming, etc.

For now, I have implemented a virtual memory with an allocation mechanism, and added concurrency and basic transaction support.

Today I'm releasing the first version of this library, i.e. v0.1. This library can't do much yet, and if I have some time, I will start working on the next version soon.

The library is open sourced; I'm hosting it on GitHub: https://github.com/rkhmelyuk/memory. Documentation can be found in the project's wiki.

First, memory should be allocated. There are a few different allocators for this (a short usage sketch follows the list):
  • FixedMemoryAllocator - allocates memory of a fixed size
  • DynamicMemoryAllocator - allocates memory of a dynamic size (with initial and max sizes specified)
  • FileMemoryAllocator - allocates memory mapped to a file; supports both fixed and dynamic allocation
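
A rough sketch of how the other two allocators could be used; the exact signatures are my assumptions based on the descriptions above, so check the project's wiki for the real API:

// dynamic memory: starts at 10KB and may grow up to 100KB (argument order is an assumption)
DynamicMemoryAllocator dynamicAllocator = new DynamicMemoryAllocator();
Memory dynamicMemory = dynamicAllocator.allocate(10 * Memory.KB, 100 * Memory.KB);

// file-backed memory of a fixed 1MB size, mapped to the given file (the file argument is an assumption)
FileMemoryAllocator fileAllocator = new FileMemoryAllocator();
Memory fileMemory = fileAllocator.allocate(new File("memory.dat"), Memory.MB);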

After memory is allocated, it can be used in the application. The common workflow is to allocate spaces in this memory. A space represents a block of memory; that block is not available for further allocation until the space is freed. A space is like a window onto the specified block of memory: it can be used to read and write data in that block.

Spaces are thread-safe. This means that locking is used when a read/write operation happens. For a single write operation there is no chance another thread will read corrupted data. For a sequence of multiple read and write operations, a transactional space can currently be used. In the next version I plan to add lock/unlock operations, so a single client can lock and use a space without the chance of it being changed by another thread.

A simple example for the end of this post:


// allocate a memory with a fixed size 20KB
FixedMemoryAllocator allocator = new FixedMemoryAllocator();
Memory memory = allocator.allocate(20 * Memory.KB);

// allocate a space in the memory with a size 10KB
Space space = memory.allocate(10 * Memory.KB);

// write a string to space memory
space.write("Hi world");

// create a transactional space, and start a transaction
TransactionalSpace transactional = space.transactional();
transactional.start();
try {
    transactional.write("Hello");
    transactional.write(transactional.readString() + " world!");

    // in transaction - original space is not changed
    assert space.readString().equals("Hi world");

    // but transactional does
    assert transactional.readString().equals("Hello world!");

    // commit transaction
    transactional.commit();
}
catch (Exception e) {
    // rollback transaction
    transactional.rollback();
}

// check transaction was committed correctly
assert space.readString().equals("Hello world!");

// it's safe to use in multiple threads
executor1.processData(space.getInputStream());
executor2.processData(space.getInputStream());

MOBI version of The Architecture of Open Source Applications book

The Architecture of Open Source Applications is a great book where one can find descriptions of the architecture of 25 open source applications, like Eclipse, LLVM, Mercurial, HDFS and Berkeley DB. This book is free to read online in HTML format, but if you want a PDF or Kindle version, you'll need to buy it either at Lulu.com or at Amazon.com.

Of course, the recommended way is to buy the book, as all royalties from these sales are donated to Amnesty International. But you can download a MOBI (Kindle) version of this book for free; just continue reading.

First, I want to tell a short story. After I found a free SICP version for Kindle compiled from HTML files at github.com/twcamper/sicp-kindle, I was looking for a chance to create something like that myself. And then I found the AOSA book. It has an HTML version, and I wanted a MOBI version to read it on my Kindle.

And so I made a Kindle version of The Architecture of Open Source Applications book and currently it's available from github.com/rkhmelyuk/aosa-mobi.

Download a Kindle version of the AOSA

Open Source: core library

A few months ago, after I moved to GitHub, one of my util projects was open sourced. It is simply called core: https://github.com/rkhmelyuk/core. I wrote this library a few years ago and was actively using it on Java projects. In some places it overlaps with the Apache Commons Lang library, but I, actually, like my child more :)

I'd like to describe some key classes and show samples of their use. Some of them are described below.

StringUtils contains a few frequently used methods, like:

  • isEmpty(), isBlank(), isNotEmpty(), isNotBlank(), isBlankTrimmed(), isNotBlankTrimmed() - used to check whether a string is empty or not. The difference between empty and blank is that empty is either null or an empty string, while blank is always a blank string and can't be null.
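
    For instance, based on the description above (the exact behavior is defined by the library's implementation):

    assertTrue StringUtils.isEmpty(null);
    assertFalse StringUtils.isBlank(null);
    assertTrue StringUtils.isNotBlank("some text");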

  • cut() - cuts a string if its length is more than specified and appends the specified suffix. The difference from a plain substring is that this method tries to split on a space, so it doesn't cut a word in half. Sample:

    String string = "some string goes here";
    assertEquals "some string...", StringUtils.cut(string, 15, "...");

  • trimIfNotNull() - if the input string is not null, trims it and returns the result:

    String string = " Hello ";
    assertEquals "Hello", StringUtils.trimIfNotNull(string);
    assertNull StringUtils.trimIfNotNull(null);

  • replaceNotAlphaNumeric() - replaces all characters that are not a letter or digit with the specified character, or "_" by default.
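
    For instance (the expected result is my reading of the description above, assuming each non-alphanumeric character is replaced individually):

    assertEquals "hello__world_", StringUtils.replaceNotAlphaNumeric("hello, world!");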


ConversionUtils contains some simple but useful methods to convert string values to numeric and boolean types. It provides the methods getInteger(), getLong(), getBoolean(), getDouble(), getDate() and getFloat():

assertEquals 1, ConversionUtils.getInteger("1");
assertNull ConversionUtils.getInteger("hello");
assertEquals 5, ConversionUtils.getInteger("hello", 5);

KeyGenerator was created to generate API keys, passwords and other random stuff. It has one highly configurable method and a few helper methods that use it. There is a way to generate keys with alpha and/or numeric and/or special symbols.

It would be hard to write assertions for these samples, but here are some simple use cases:

KeyGenerator.generateKey(10, KeyGenerator.WITH_ALPHA_LOW | KeyGenerator.WITH_ALPHA_UP);
KeyGenerator.generateStrongKey(100);
KeyGenerator.generateSimpleKey(20);
KeyGenerator.generateAlphaKey(20);

After I found some issues with the Apache Commons Lang ToStringBuilder, I wrote my own replacement and called it... ToStringBuilder :) It is very simple to use:

class Blog {
    private String name;
    private String author;
    private int year;

    // getters and setters omitted

    public String toString() {
        return new ToStringBuilder(Blog.class)
                .field("name", name)
                .field("author", author)
                .field("year", year)
                .toString();
    }
}

Blog blog = new Blog();
blog.setName("Java UA");
blog.setAuthor("Ruslan Khmelyuk");
blog.setYear(2010);

assertEquals "Blog[name=Java UA, author=Ruslan Khmelyuk, year=2010]", blog.toString();

There are more interesting tools, like ArgumentAssert and StateAssert, used to assert arguments and program state respectively.
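A rough idea of how they read; the method names here are guesses for illustration, not necessarily the exact API:

ArgumentAssert.isNotNull(name, "name must not be null");
StateAssert.isTrue(initialized, "service is not initialized yet");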

CollectionUtils also contains a few useful methods, but I'm not going to describe them here.

The library is open for review and use. Still, it's definitely not the best one of its kind and, I think, has value mostly for me.