Blog

Some useful .bashrc tips

While cleaning up my .bashrc file, I noticed some shortcuts that might be useful for others. Here they are:

SSH with .bashrc

Sometimes I need to SSH to a new machine and want my .bashrc file there. This short function transfers my local .bashrc file to the remote server and then opens the SSH session:


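function xs() {
    if [ "$#" -eq 0 ]; then
        echo "usage: xs user@host other_args"
        return 1
    fi

    local host=$1
    shift

    echo "Setting up remote machine..."
    # Copy the local .bashrc to a temporary file on the remote host.
    # other_args are passed to both scp and ssh, so they must be valid for both.
    scp "$@" ~/.bashrc "$host":/tmp/.bashrc_temp > /dev/null

    echo "Connecting to $host"
    # Start bash with the copied rc file and remove it when the session ends
    ssh -A -t "$@" "$host" "bash --rcfile /tmp/.bashrc_temp ; rm /tmp/.bashrc_temp"
}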

Perl

Aliases to invoke the Perl debugger and a REPL for running Perl code snippets:


alias perld='PERLDB_OPTS="windowSize=20 arrayDepth=300 hashDepth=300 dumpDepth=100" perl -MPerlIO='"'"'via;$DB::deep=9999'"'"' -d'
alias repl='PERLDB_OPTS="windowSize=20 arrayDepth=300 hashDepth=300 dumpDepth=5" perl \
-MPerlIO='"'"'via;$DB::deep=9999; \
'"'"' -de0'

Highlighter

This simple function can be piped after a command’s output to highlight a specific keyword. It is useful when you are looking for a specific piece of text but want to see the whole context in addition to the keyword:


#highlight the given text in the input (cat log.txt | highlight ERROR)
highlight () {
perl -pe "s/$1/\e[1;31;21m$&\e[0m/g"
}
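
For comparison, here is a minimal Python sketch of the same idea. It is not part of my .bashrc; the function and constant names are made up for illustration. Unlike the Perl one-liner, which treats the keyword as a regular expression, this sketch escapes the keyword and matches it literally:

```python
import re

RED_BOLD = "\x1b[1;31m"   # ANSI escape sequence: bold red text
RESET = "\x1b[0m"         # ANSI escape sequence: reset all attributes


def highlight(text, keyword):
    """Wrap every occurrence of keyword in color codes and keep the rest as-is."""
    # re.escape makes the keyword match literally, not as a regex
    pattern = re.compile(re.escape(keyword))
    return pattern.sub(lambda m: RED_BOLD + m.group(0) + RESET, text)
```

Calling `highlight(line, "ERROR")` on each input line and printing the result keeps the full context visible while the keyword stands out.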

vim

Open the last output using vim:

You can use the fc -s command to re-execute the last command and feed its output to vim. This is useful when the last command prints the name of a file you were looking for and you then want to open it:

alias viml='vim $(fc -s)'

Single letter aliases

Due to frequent usage, I have these 3 single-letter aliases:


alias g='git'

alias k='kubectl'

alias d='docker'

Code snippet: How to manually validate a Bitcoin block?

The other day, I was trying to understand how Bitcoin works. There are a lot of references that give you a general description of the cryptocurrency, but few of them get into the real details of the internals.

The blog post at [1] was really helpful, although it did not act as I wanted. After a little searching here and there, I finally managed to prepare a Python code snippet you can use to manually make sure a Bitcoin block is valid.

This code represents the underlying design of the Bitcoin mining algorithm. Miners try different inputs (most importantly, the nonce) until they come up with a hash that starts with enough zeros.

Here is how you can use this code:

You need to provide six inputs to the code:

  1. version: The version number (Normally 2 but can differ in some cases)
  2. prev_block: The hash of the previous block
  3. merkle_root: The Merkle root hash, which summarizes the transactions in the block
  4. date_time: The time at which the block was mined
  5. bits: A compact encoding of the difficulty target
  6. nonce: A random number generated as a part of mining the block

You can easily find these values for each block on the Blockchain.info website. If you put them into the code below, the printed result should be equal to the “Hash” field shown there.

This code snippet shows how a block’s header is hashed to produce a hash value; if that value has a specific structure, the block is valid.

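# Current values here are related to block number 474650
import hashlib
import time

def reverse(a):
    # Reverse the byte order of a hex string (two hex digits per byte)
    return "".join(reversed([a[i:i+2] for i in range(0, len(a), 2)]))

def to_hex(n):
    # Format a 32-bit integer as exactly 8 hex digits, keeping leading zeros
    return format(n, '08x')

version = "00000020"  # already written in little-endian byte order
prev_block = "0000000000000000012d11c46a420474875d0e3cfcdba19aac18df597fbb6d21"
merkle_root = "2f4f3f91ce5ffa9edcf728f9574f1e5ee1f97fe6142ee612da1db14f35f0d2db"
date_time = '07.07.2017 15:22:06'
bits = 402754864
nonce = 999740600

pattern = '%d.%m.%Y %H:%M:%S'
# time.mktime interprets date_time in the local time zone of this machine
epoch = int(time.mktime(time.strptime(date_time, pattern)))

# The header stores every field in little-endian byte order
header_hex = (version +
              reverse(prev_block) +
              reverse(merkle_root) +
              reverse(to_hex(epoch)) +
              reverse(to_hex(bits)) +
              reverse(to_hex(nonce)))

header_bin = bytes.fromhex(header_hex)
# The header is hashed with SHA-256 twice
block_hash = hashlib.sha256(hashlib.sha256(header_bin).digest()).digest()

# Block hashes are displayed with the byte order reversed
print(block_hash[::-1].hex())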
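
The “starts with enough zeros” condition can also be checked numerically. The sketch below is an addition, not part of the original snippet: it expands the compact `bits` value into the full target and compares the hash against it. The function names are made up, and it assumes the exponent byte of `bits` is at least 3, which holds for the value used above.

```python
def bits_to_target(bits):
    """Expand the compact 'bits' field into the full 256-bit target."""
    exponent = bits >> 24           # top byte: length of the target in bytes
    mantissa = bits & 0x00FFFFFF    # lower three bytes: leading digits of the target
    return mantissa << (8 * (exponent - 3))


def meets_target(block_hash_hex, bits):
    """True if the hash, read as a big-endian number, does not exceed the target."""
    return int(block_hash_hex, 16) <= bits_to_target(bits)
```

Passing the hex string printed by the code above together with `bits` to `meets_target` tells you whether the block satisfies its difficulty requirement.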

[1] http://www.righto.com/2014/02/bitcoin-mining-hard-way-algorithms.html

 

Build your own website: Hugo + Github Pages + Namecheap

 

Did you know that you can have your own website without paying for hosting?

OK, the purpose of this post is not to promote a cost-saving method, but to show you how you can set up your website using a static content generator (Hugo) and host it on GitHub.com, connected to your own domain. So you don’t need to pay for and manage a hosting service.

What you need for this:

  1. A registered domain name (You can still use this with a sub-domain on github.io website, but the process will be a bit different)
  2. An account in GitHub.com website
  3. A terminal application (Like iTerm on Mac or the command line on Windows)

Here is the list of steps you need to take:

  1. Set up DNS for your domain
  2. Set up your GitHub.com account
  3. Download Hugo on your computer and write something
  4. Update your GitHub.com account with the new content

And now the details!

Set up DNS

Details of this step depend on your domain registrar. For me, it is NameCheap.

You need to point your domain at GitHub.com as its hosting system, so that when anyone types your domain name into their browser, the request is directed to GitHub, which shows the content.

First, you need to set “NAMESERVERS” option in the domain management page to “Namecheap BasicDNS”.

[Screenshot: “NAMESERVERS” option set to “Namecheap BasicDNS”]

After that you can go to “Advanced DNS” page and update “HOST RECORDS” like below:

[Screenshot: “HOST RECORDS” on the “Advanced DNS” page]

Here the numbers are specific to GitHub, so you will need to enter them exactly like this. “mahdix” is my username on the GitHub website. You will need to replace this with your own username.

Set up GitHub

After setting up your DNS records, you need to tell GitHub about your domain, so that when a request for that domain name arrives, it knows what to show.

Go to your GitHub.com account and add a new repository by clicking the “+” button on the top right of the screen and selecting “New repository”. It’s better to give this new repository the same name as your domain, but you can choose whatever name you like. Make sure “Initialise this repository with a README” is checked, so your repository will be created with some basic content.

[Screenshot: creating a new repository on GitHub]

Get Hugo and write

First, you will need to open your console application. Make sure `git` is installed on your system. If it is not, follow instructions here to do the installation.

In the console, run `git clone` command to fetch a copy of your newly created repository.

Then you will need to run Hugo and write some content. You can follow instructions here as a starting guide.

When you are finished writing some content, you will need to push your changes to GitHub. This step informs GitHub about the new content you have just generated. This is done via `git push origin` command. Note that you may need to set up SSH keys if you are doing this for the first time (explained here).

Finish GitHub set up

Now, you need to tell GitHub about your domain name. Go to your GitHub repository, and go to “Settings” tab. There is a section called “Custom Domain”. You will need to enter your domain name there:

[Screenshot: “Custom Domain” section of the repository settings]

After you have done this step, your new content will be visible in a few minutes (This takes a bit longer only for the first time). Note that you only need to do this step once.

How to automate the publish process

There is a way to use some online services to automate the Hugo part. So you only need to make changes on GitHub website interface (write content, create new pages, …) and as soon as you submit your changes, they will be published for you. I will explain this in a follow-up post.

 

Must-have vim plugins

The other day I was cleaning up my .vimrc file and thought about writing a post about plugins that I use and the reason.

I believe in simplicity, especially in my work environment. That’s why I always try to customize as little as possible and use existing features. During the couple of years since I started using vim, I have found the plugins below really useful.

Vundle

This is the most important plugin that I use. It helps me install other plugins more easily and conveniently.

You just need to start your `.vimrc` file with a specific format and then, for each plugin you want to have, add a line in this format:

Plugin 'tpope/vim-commentary'

The text inside the single quotes is the GitHub address of the plugin’s repository (without the github.com part).

But since I have so few plugins, I may be able to get rid of this one in the future.

vim-commentary

This is the plugin which helps me comment/uncomment a line or a block of text. You simply need to use `gc` and the currently selected block will be commented (or uncommented if it’s already commented). `gcc` does this for the current line.

You can combine this command with text movement verbs (e.g. `gc10j` to comment current and next 10 lines).

The good thing about this plugin is that it detects the correct commenting syntax so it doesn’t matter if you are working on a bash script or Python source code file. Just press the shortcut and this plugin will handle the rest.

fzf

I am always editing files and switching between buffers in vim. Unfortunately, standard features of vim are never enough for me (I wonder why they don’t provide better options built-in). That’s why I use this excellent plugin.

It uses `fzf` command line utility as a back-end and lets you search for a buffer, file, MRU or text in current or all buffers. It is a pretty handy tool for me.

Plugins that I used before but not now

supertab

This is a great plugin which lets you use <TAB> key in insert mode to invoke autocomplete feature of vim. But I find current shortcuts enough for my job, especially considering the fact that, due to vim being a text editor, I cannot expect any sophisticated autocomplete which makes this feature less useful for me.

CtrlP

This was the plugin I was using before fzf. It’s a very powerful plugin and lets you search for buffer, file, text and some other handy features.

The reason I switched to fzf was that it uses fzf command line utility which can be handy sometimes for me when I am working inside the terminal.

mru

In a (failed) effort to replace CtrlP with a simpler alternative, I tried this one but gave up and returned to my favorite searcher plugin.

This is a useful plugin which helps search into most recently opened files and open them.

vim-buffergator

Another plugin which was supposed to replace fzf (CtrlP at that time) but did not succeed. It lets you search in current open buffers and switch to another buffer.

delimitMate

This plugin provides autocompletion for delimiters like single-quote, double-quote, braces, etc.

The reason I no longer use this plugin is that during my edits, there are lots of times that I need to insert a single delimiter, but this plugin comes into my way and forces insertion of an extra delimiter which I have to delete. Sometimes this can be frustrating, although some other times it can be handy.

vim-operator-highlight

This plugin highlights all operators (`()+-*/` and other similar operators) in your file. This can help you read a source code file more easily but my problem with this plugin was that it also highlights comment section of the code which makes reading comments annoying. Also having a lot of different colors in your text editor can sometimes be confusing.

syntastic

This plugin checks the syntax of the current source code file and lets you know the error messages (if any). It highlights error lines and lets you easily navigate between errors. Personally, I prefer to use a terminal to check/compile my source code. And also having experience in working with dynamic typing languages, there is not much use for syntax checking as almost everything is evaluated at runtime (which I don’t like but that’s another story).

My first experiment writing code in Scala

(This post was written on Nov 17, 2016. For some reason I forgot to publish it at that time).

For the past few days, I have been assigned a small programming task which should be done in Scala. This is part of a hiring process and I have been asked to implement the server side of a tic-tac-toe game in Scala providing a REST API.

Now, this is the first time I am writing code in Scala. So although I am not a professional Scala developer yet, I thought I should write about my experience here.

The Good

I really like the language. It is simple and straightforward. The principle of least astonishment is obeyed, and I had very few surprises when I wanted to see how something is done, which is really good.

One of the positive points was the fact that there is no separation between primitive and Object-based data types. So we have “Int” and not “int” + “Integer” (Like Java). This helps a lot.

I also liked the way objects are defined: Like a function which accepts a set of inputs:

class Game(val player1: String, val player2: String) { ... }

Also the `var` and `val` separation is really encouraging the developer to make the best decision about the immutability of the data he is working with. It makes you think in a new way.

In Scala, you can use Java libraries easily without any pain. That gives you access to thousands of valuable frameworks and modules.

The Bad

Well, this section is mostly personal preference.

In Scala (unlike C and Java and similar languages), you should obey “variable name : type” rule which is not normal for me, considering the fact that I have worked for many years in C-like languages which have: “type variable_name” notation.

The “=” notation when defining a function which is returning a value, does not seem natural to me. Usually, you use “=” when you want to assign a value to a variable but the functions are not variables!

Same for the “<-” notation which is used to iterate over a collection.

Also, I noticed there are no “break/continue” keywords in Scala. Well, it is possible to “simulate” them using a set of built-in constructs and exceptions, but seriously! Exceptions were not invented to break out of a for loop! You should not use them to control program execution.

In defense of deterministic password managers

tldr; Although they have some shortcomings, deterministic password management is a great idea with a lot of benefits!

Today I read this post, which was trying to convince people to switch to stateful password managers that store all passwords in some central data storage, locally or in the cloud. I have to admit that the article had some valid points, but generally, it did a poor job of convincing me that deterministic password managers are bad.

First let me explain what a non-deterministic password manager is.

Normally, when you register on a website, you use a password manager to keep track of all of your passwords. So after signing up for a new email account (e.g. mahdix@gmail.com), you will store the email ID and its password (which is chosen by you) in the password manager’s database.

[Figure: How a password manager works. Image taken from the Zoho.com website]

Obviously, in this scheme, the central database becomes the most important component and needs extra protection. Where will this database be stored? Either on the service provider’s servers, which are out of your control, or locally on your laptop or mobile phone. So if there is a data breach or your mobile phone is infected with a virus, your passwords can end up in the wrong hands! Of course, there are a number of security measures to protect the confidentiality of this data, but as long as you have no control over them, there is a risk.

What is the solution? There has recently been a new way to manage passwords, which I call deterministic password management (or DPM for short). This means that there is no central password database. Everything is inside your head! There is no need to have a cloud-based server to store your passwords. Generally, you have a basic passphrase which I call the “master password”. All other passwords are generated uniquely using a formula based on the “master password” and some other public information. As a result, you don’t need to memorize 100 passwords or save them anywhere! Just memorize the master password and the rest can be automated.

For example for website “gmail.com” and username “mahdix” the password will be calculated using some formula like this:

password = Hash(master_password, 'gmail.com', 'mahdix')
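
To make the formula concrete, here is a minimal Python sketch. It is only an illustration, not how any particular DPM works: the choice of PBKDF2 as the hash, the iteration count, the separator and the 16-character length are all my assumptions.

```python
import base64
import hashlib


def derive_password(master_password, website, username, length=16):
    """Derive a site password from the master password and public information."""
    # Join the public fields with a separator so ("ab", "c") differs from ("a", "bc")
    salt = (website + "\n" + username).encode("utf-8")
    # PBKDF2 is deliberately slow, which makes guessing the master password costlier
    digest = hashlib.pbkdf2_hmac("sha256", master_password.encode("utf-8"), salt, 100_000)
    # Turn the raw bytes into printable characters and cut to the requested length
    return base64.urlsafe_b64encode(digest).decode("ascii")[:length]
```

Calling `derive_password(master, "gmail.com", "mahdix")` twice gives the same result, so nothing has to be stored, while a different website or username gives an unrelated password.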

The most important advantage of this scheme is that there is no need to store sensitive data anywhere! I would like to stress the term “sensitive” because I will refer to this feature later. The other advantage is that your passwords will all be strong, because they are generated using a hash function, which makes the result look random. There will no longer be passwords like “letmein” or “admin” in this scheme.

What are flaws of Deterministic Password Managers?

Now, back to the initial point! The blog post discusses some reasons why this type of password management is not secure and as the author claims: “has fatal flaws”.

There are four points discussed in the article:

  1. DPMs cannot accommodate varying password policies without keeping state: I disagree with this one. As long as the hash function is complex enough (which it normally is), the generated password can comply with every sane password policy. The examples that the author provides, “password must be at least 8 characters long” or “password must be lowercase only”, are either satisfied by default or don’t make sense! I have never seen a website which insists that a password must be lowercase.
  2. DPMs cannot handle revocation of exposed passwords without keeping state: I agree with this one, but see the section below.
  3. DPMs cannot store existing secrets: Agreed, but it can be fixed without losing security. See the section below.
  4. Exposed master password alone exposes all of your site passwords: I disagree with this point because it is not an issue of DPMs per se. If your email account’s password is exposed, it will probably have a similar effect! That is how email works. This is the natural side effect of using email.

State vs Sensitive data

I would like to differentiate between the state of a password manager and sensitive data (e.g. passwords). Of course, for a traditional password manager like 1Password, these are the same: the state of the password manager is the collection of stored passwords. So it is sensitive data by definition.

But for DPMs this is not the case. You can have some basic and minimal state which is not sensitive. What does that mean? It means that this state can be publicly exposed and available without losing the security of any of your accounts. The only important security property of state in this sense is “integrity”: it should not be altered without your consent. If this property is satisfied, then there is no problem for a DPM to have publicly available state.

For example, you can store this state in your Google Drive or Dropbox or even on GitHub! If you want to maintain a higher level of security, you can encrypt that information using the “master password”, but that won’t be required.

Now, how can using state address the two problems stated above: Password revocation and storing existing secrets?

How to store existing secrets in a DPM?

Suppose that you already have an account with password: EP. Now you want to switch to DPM. Can you keep this password? It is possible using State.

Suppose that the DPM expects the password of this account to be CP, where CP is the result of the system’s hash formula. It is enough to store “XP=CP^EP” in the State storage of the system (^ represents the XOR operation). Now, without “CP”, “XP” alone does not reveal “EP”. But the user can easily calculate “EP” by calculating “CP” and then “EP=CP^XP” (you can use any other function with a similar property).
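
The XOR trick can be sketched in a few lines of Python. This is an illustration under my own assumptions: CP is produced with PBKDF2 and stretched to the length of EP, and all function names are made up. Note that the stored value still reveals the length of the existing password.

```python
import hashlib


def computed_password(master_password, website, username, length):
    """Stand-in for CP: bytes derived from the master password and public info."""
    salt = (website + "\n" + username).encode("utf-8")
    return hashlib.pbkdf2_hmac(
        "sha256", master_password.encode("utf-8"), salt, 100_000, dklen=length
    )


def xor_bytes(a, b):
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def store_existing(master_password, website, username, existing_password):
    """XP = CP ^ EP. XP is the value that goes into the public state."""
    ep = existing_password.encode("utf-8")
    cp = computed_password(master_password, website, username, len(ep))
    return xor_bytes(cp, ep)


def recover_existing(master_password, website, username, xp):
    """EP = CP ^ XP, computed from the master password and the stored state."""
    cp = computed_password(master_password, website, username, len(xp))
    return xor_bytes(cp, xp).decode("utf-8")
```

Storing the result of `store_existing(...)` and later passing it to `recover_existing(...)` with the same master password returns the original password.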

Using this method, a user can store his existing password without revealing any sensitive information. Same approach can be used to renew a password. Just keep a public parameter (like counter), in the State data. It won’t expose anything sensitive and can be used to renew the password. Just increment the counter and the password will be:

Password = Hash(master_password, website, user_id, counter)

This can easily be incorporated into any DPM to provide features for renewing passwords or adding pre-existing passwords.
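
Here is a minimal Python sketch of counter-based renewal. The hash construction (PBKDF2), the dictionary used as state and the function names are my assumptions for illustration only.

```python
import base64
import hashlib


def derive_password(master_password, website, user_id, counter=0, length=16):
    """Derive a password; changing the public counter yields a fresh one."""
    # All public inputs, including the counter, go into the salt
    salt = "\n".join([website, user_id, str(counter)]).encode("utf-8")
    digest = hashlib.pbkdf2_hmac("sha256", master_password.encode("utf-8"), salt, 100_000)
    return base64.urlsafe_b64encode(digest).decode("ascii")[:length]


def renew(state, website, user_id):
    """Increment the counter kept in the non-sensitive state and return it."""
    key = (website, user_id)
    state[key] = state.get(key, 0) + 1
    return state[key]
```

The `state` dictionary holds only counters, so it could be kept anywhere public; after `renew(state, "gmail.com", "mahdix")` the next call to `derive_password` with the new counter gives a different password for the same account.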

Of course, there will always be a trade-off. So along with the advantages mentioned above, there are some drawbacks too. Most notably, if your master password is exposed or stolen, all of your accounts will be at risk. This is the natural consequence of using a “master password”.

Why do I prefer statically typed programming languages?

tldr;  Dynamic typing sucks!

I have been working with a dynamic programming language, Perl, for the past 1.5 years. I also have worked with Java and C# languages during my career and at this point, I can really say that static typing makes the life of developers much easier.

So let me first explain the difference between these two terms and then I will talk about my reasons.

Dynamic Typing

Dynamically typed programming languages are languages where the type of the data is not specified in the source code, but determined when the program is being executed. So the developer does not have to think about whether some variable should hold a number, a string or an instance of a DateTime object. He just uses a variable name. When the code is being executed, the runtime environment (interpreter, JIT compiler, …) will deduce the type which is most appropriate for that variable.

For example, the developer can write something like this, in Perl:

my $birth_date = Some::Module::get_birth_date();
$birth_date->plus_time_interval('18y');
my $day_of_week = $birth_date->day_of_week;

In the above code, the types of `$birth_date` and `$day_of_week` are not explicitly specified. When someone reads this piece of code, it clearly implies that `$day_of_week` should be an integer, but in a large codebase, the type is not always this obvious from looking at the code. So, for example, you really don’t know the type of `$birth_date` unless you read the source code of the `get_birth_date` function, and you can’t know whether `plus_time_interval` expects an integer or a string or something else unless you have used that function before.

Now assume someone is responsible for maintaining or fixing a bug in a 100K SLOC codebase. When investigating the flow of execution of the code, it sometimes becomes really hard and complicated to find out the real type of a variable.

Examples of dynamically typed programming languages are Perl, Python, Ruby and JavaScript.

Static typing

On the other hand, in statically typed programming languages, the developer is responsible to explicitly state the expected type of all of the variables or function outputs. This will limit the scope of valid operations on a variable and help the compiler (or possibly the interpreter) to check for invalid operations. For example, when the developer defines an integer variable, he cannot treat that variable as a string or assign a string literal to it because the compiler will catch such errors. An example of statically typed code:

int x = 12;
int y = x + 30;
Customer c = find_customer("id", y);
c.name = "Mahdi";
c.save();

In the above code, type of the variables `x`, `y` and the input/output of the function `find_customer` are all specified statically within the source code. You cannot pass an integer parameter as the first argument of `find_customer` and compile the code because you will get a compiler error.

So writing statically typed code seems harder because the developer has to write more code and keep track of variable types in his mind while writing it. Right? Yes! That is right if and only if you want to write a few lines of code, keep them for a relatively short period of time and ignore all of that afterward. But if you want to maintain a relatively large code base over a longer period of time (which is the case most of the time in the software industry), this burden proves to have some benefits.

Examples of statically typed programming languages include C, C++, Java and C#.

Why static typing?

As I pointed out in the previous section, unless you want to write a few lines of code, keep it for a short period of time and discard everything afterward, statically typed programming languages have a clear advantage over dynamically typed ones.

The point is, source code is written once, but it will be read many times (for review, bug fix, refactoring, migrations, optimizations, …). So the basic idea in static typing is that you write some more code and take care of some more things when writing it, and in exchange, you (or others) will have a much easier life when reading it. Another clear benefit is that the compiler can help you find lots of possible errors before you release your code to the production environment.

I have had numerous cases of the “Can't locate object method 'xyz' via package "ABC::DEF"” error on the production server (because in the code, someone was trying to call a method on a variable which was expected to be of type T, but at runtime, it had another type). That is because we, human beings, make mistakes. And computers are there to help us prevent those mistakes. If we just ignore their help (and use dynamic typing), then this is what happens next 🙂
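
The same kind of failure is easy to reproduce in any dynamically typed language. Here is a small Python sketch (not the production code I mentioned; the function and field names are hypothetical) in which a helper sometimes returns a date and sometimes a plain string, and nothing complains until the method call is actually executed:

```python
from datetime import date


def get_birth_date(record):
    # Hypothetical helper: returns a date for complete records, raw text otherwise
    if "birth_date" in record:
        return record["birth_date"]
    return record.get("birth_date_text", "")


def day_of_week(record):
    # Nothing checks the type of the value before the method call below
    return get_birth_date(record).weekday()


print(day_of_week({"birth_date": date(2017, 7, 7)}))

try:
    day_of_week({"birth_date_text": "1990-01-01"})
except AttributeError as error:
    # The mistake only shows up when this code path runs
    print("runtime failure:", error)
```

In a statically typed language, a function declared to return a date could not return a string, so the second call would be rejected before the program ever runs.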

But don’t get me wrong. There are cases where dynamic typing is the preferred choice when writing a software system. Generally speaking, if the code base size is relatively small (say 1-10K SLOC) with a small development team (~5 people), then dynamic typing can have benefits. Because obviously developers can write the code much faster and time-to-market will be much shorter. And this is what normally happens for startups where founders want to deliver the product/prototype as soon as possible.

I have seen some people say that we can achieve the same advantages of static typing in a dynamically typed programming language by writing good tests. This is possible in theory, but if you implement all of the required test cases for this purpose, then you will be doing what the computer is supposed to do: manually doing something which can be automated. And by the way, the resulting code base will become much larger because of all those tests, which eliminates one of the important advantages of dynamic typing: more concise code with less boilerplate.

What is wrong with Object Oriented Programming and what is right with Functional Programming?

I have been reading numerous articles about OOP and FP and comparisons of the two recently. There are lots of differences and pros/cons for each of them, but one important and interesting thing I recently found out is how tangled behavior and data are in OOP. I have been writing OO code for years, but I had not realized this until I read about FP and the way it handles data/behavior.

When you write software with an OO approach, you must relate every behavior to a single isolated data entity (or class) in the business domain of the application. For example, when writing code to send email, you have a data entity called “Email Message” and a behavior “Send a message”. There is no way for you to define these two separately. By separately I mean being able to invoke the “Send Email Message” functionality without an “Email Message” object. Of course, you need a message when you want to send one, but this is a difference in the way we organize and approach things in OOP. When writing software with an OO approach, you MUST define each behavior under one and exactly one class.

Question is, what if we cannot fit a behavior in this pattern? What if it doesn’t make sense or it is confusing to attach a behavior to one and only one class?

Considering above example about sending an email message, what happens if we have other entities in the application like Sender, Recipient, and Storage. Then where should I put “Send email message” functionality? Each one of the below candidates can be an acceptable place for putting this behavior.

  1. You can put “Send” method in “Message” class: msg1.Send();
  2. It can also be placed inside Recipient class: recipient.ReceiveMessage(msg1);
  3. It can also be part of Sender class: sender1.SendMessage(msg1);
  4. It can be part of a separate class. For example MessageBroker: MessageBroker.getInstance().SendMessage(msg1);

This is really confusing and IMHO unneeded complexity.

The advantage of Functional Programming approach is that you don’t have to bind each behavior to one class. They are completely separated. So you can define and use them separately. This type of modeling is more consistent with the way we think about software world concepts.
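
As a rough illustration of that separation, here is a Python sketch (the class and function names are mine) in which the data types carry no behavior and “send” is a plain function that takes everything it needs as arguments:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    subject: str
    body: str


@dataclass(frozen=True)
class Sender:
    address: str


@dataclass(frozen=True)
class Recipient:
    address: str


def send_message(sender, recipient, message):
    """A free function: the behavior belongs to no single class."""
    # A real implementation would talk to a mail server; this only builds the text
    return (
        f"From: {sender.address}\n"
        f"To: {recipient.address}\n"
        f"Subject: {message.subject}\n\n"
        f"{message.body}"
    )
```

With this layout there is no question of whether “send” belongs to Message, Sender or Recipient; it simply takes all three.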

Of course for things which exist in the physical world, this is less of an issue (A car is the only object which can do the “Start” functionality). But for more abstract concepts which can only be found in the software world, I think FP approach makes more sense.

 

What is Apache Maven and why do we need it?

I was going to write a blog post about Spring framework but I thought maybe it’s better to start with Maven as the first step.

With the increase in the number of Java libraries, frameworks and projects, there was a need for a standard way of describing how a piece of software should be built and what its dependencies are. Normally a Java project uses a lot of frameworks and tools (e.g. Logging, Database, Encryption, Math, …). We need a standard (machine-readable) way of describing these dependencies and how to compile and build the project.

Using Maven enforces a specific directory structure to your project. There is a specific directory path for your source code, runtime resource files, … Another change in your project is adding a file named pom.xml which will describe dependencies and build options.

I will explain some basic concepts in the Maven world, then will proceed to create a new project based on Maven.

Artifact

An Artifact is a JAR file which is available for use in a Java project. Artifacts can be downloaded from a Maven repository (a publicly available website), but normally this is done automatically by the Maven command line tool.

Each Artifact represents a small piece of software that can be re-used in a software project. For example, there are artifacts for logging and encryption. There are a number of Maven repositories available, but the well-known repository is the one provided by Maven itself which is available for search and browse at http://search.maven.org/. You can search for artifacts and download them. You also have a local repository which contains JAR files downloaded from remote repositories or installed locally.

Each artifact is uniquely identified by using 3 elements: group-id, artifact-id and version.

Group-id represents the name of the company or organization which has produced the artifact (e.g. org.apache.logging.log4j). Artifact-id is a unique string for the artifact within its group (e.g. log4j-core), and version is the version string of the artifact (e.g. 2.8.2). When you specify your project’s dependencies, you have to write their group-id, artifact-id and version.
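
These three elements also determine where an artifact is stored inside a repository. The following Python sketch (the function name is made up) builds the relative path according to the standard repository layout, in which the dots in the group-id become directory separators:

```python
def artifact_path(group_id, artifact_id, version, packaging="jar"):
    """Relative path of an artifact inside a Maven repository."""
    # Dots in the group-id become directory separators
    group_dir = group_id.replace(".", "/")
    file_name = f"{artifact_id}-{version}.{packaging}"
    return f"{group_dir}/{artifact_id}/{version}/{file_name}"
```

For example, `artifact_path("junit", "junit", "4.8.2")` gives `junit/junit/4.8.2/junit-4.8.2.jar`, which is the path of that JAR below the root of the repository.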

You can also tell Maven to produce an artifact when compiling your project. The result will be a .jar file including metadata about the artifact. You can then install this jar file into a Maven repository and use it as a dependency in other projects.

Build lifecycle

A build lifecycle is a set of steps (phases) which Maven performs to build the project. You can define your own lifecycle, but the default lifecycle is used most of the time. Here is a list of the most important phases in the default lifecycle:

  1. validate: Make sure project structure is correct and all required data are available
  2. compile: Compile source code files
  3. test: Run unit test files
  4. package: Package compiled files in a distributable format (e.g. JAR)
  5. install: Install package into the local repository.

There is also another important lifecycle named “clean”. This lifecycle cleans up the artifacts created during the build lifecycle.

You can call the Maven command line tool and give it the name of the phase or lifecycle to execute:

mvn clean package

The above command will run the clean lifecycle and then run the build lifecycle up to the ‘package’ phase. The target of the operation is the current directory.
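
The “up to” behavior can be modeled in a few lines of Python. This is only a toy model of the rule, using the five phases listed above (the real default lifecycle has more phases), and the names are mine:

```python
# Phases of the default lifecycle listed in this post (the real lifecycle has more)
DEFAULT_PHASES = ["validate", "compile", "test", "package", "install"]


def phases_to_run(target_phase):
    """Return every phase that is executed when target_phase is requested."""
    index = DEFAULT_PHASES.index(target_phase)  # raises ValueError for unknown phases
    return DEFAULT_PHASES[: index + 1]
```

So asking for ‘package’ means validate, compile and test are executed first, in that order, while ‘install’ is not executed at all.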

Directory structure

Maven expects a standard project structure in your project. You have to follow this structure:

Suppose the root directory of the project is named ‘myapp’:

  • myapp/pom.xml
  • myapp/src/main/java/com/mahdix/App.java: Source code files should reside in this location and below, according to your package names.
  • myapp/src/test/java/com/mahdix/Test.java: Location for test files

You run the Maven command line utility in the root directory of the project, where the pom.xml file is located.

POM.xml

Here is the general structure of a pom.xml file:

<project xmlns="http://maven.apache.org/POM/4.0.0">
 <modelVersion>4.0.0</modelVersion>
 
 <groupId>com.mahdix</groupId>
 <artifactId>my-app</artifactId>
 <version>1.0</version>
 <packaging>jar</packaging>
 
 <name>Test Application</name>
 <url>http://www.mahdix.com</url>
 
 <dependencies>
   <dependency>
     <groupId>junit</groupId>
     <artifactId>junit</artifactId>
     <version>4.8.2</version>
   </dependency>
 </dependencies>
</project>

Here is an explanation of the important parts:

  • modelVersion: Specifies the version of the POM model used by this file; this should always be 4.0.0
  • groupId, artifactId, version: These tags specify information about the current project
  • dependencies: This tag lists dependencies of the project. Maven will automatically fetch, download and install required JAR files according to the dependency list provided. You just need to specify artifacts that you need.
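
Because pom.xml is plain XML, the dependency list is easy to read from other tools as well. Below is a small Python sketch (not something Maven itself needs; the function name is made up) that extracts the dependencies from a trimmed copy of the file above. Note that the tags live in the POM namespace declared on the project element, so lookups have to use it.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the pom.xml shown above
POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mahdix</groupId>
  <artifactId>my-app</artifactId>
  <version>1.0</version>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.8.2</version>
    </dependency>
  </dependencies>
</project>"""

# Prefix "m" is an arbitrary local name for the POM namespace
NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def list_dependencies(pom_text):
    """Return (groupId, artifactId, version) for each declared dependency."""
    root = ET.fromstring(pom_text)
    result = []
    for dep in root.findall("m:dependencies/m:dependency", NS):
        fields = ("groupId", "artifactId", "version")
        result.append(tuple(dep.findtext(f"m:{tag}", namespaces=NS) for tag in fields))
    return result
```

Calling `list_dependencies(POM)` returns one tuple per dependency, here the single junit entry.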