Useful Linux Commands 01/2014

a) Bulk rename on the command line. I needed this one to re-import bulk files into a BI database. Every file that has been processed gets the prefix 'proc_', which marks it as already imported into the BI database. See http://tips.webdesign10.com/how-to-bulk-rename-files-in-linux-in-the-terminal

TEST YOUR EXPRESSION:
$ rename -n 's/^proc_order_list_(201312\d{2}-\d{6}-[A-Z]{3}-[A-Z]{2})\.csv$/order_list_$1.csv/' *.csv

proc_order_list_20131211-025914-AMZ-EU.csv renamed as order_list_20131211-025914-AMZ-EU.csv
proc_order_list_20131211-031130-ENG-DE.csv renamed as order_list_20131211-031130-ENG-DE.csv

DO THE ACTUAL RENAMING:
$ rename 's/^proc_order_list_(201312\d{2}-\d{6}-[A-Z]{3}-[A-Z]{2})\.csv$/order_list_$1.csv/' *.csv
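
Some distributions ship the util-linux rename, which does not understand Perl expressions like the one above. In that case the same mapping can be checked with bash's built-in regex matching. This is only a sketch: new_name_for is a name I made up, and since bash uses POSIX extended regular expressions, [0-9] stands in for \d.

```shell
#!/bin/bash
# Sketch: compute the new name for ONE file name without touching any file.
# new_name_for is a hypothetical helper, not part of rename.
new_name_for() {
  local name=$1
  # Same pattern as the rename expression, written as a POSIX ERE.
  local re='^proc_order_list_(201312[0-9]{2}-[0-9]{6}-[A-Z]{3}-[A-Z]{2})\.csv$'
  if [[ $name =~ $re ]]; then
    printf 'order_list_%s.csv\n' "${BASH_REMATCH[1]}"
  else
    return 1   # name does not match, nothing to rename
  fi
}
```

Calling new_name_for proc_order_list_20131211-025914-AMZ-EU.csv prints order_list_20131211-025914-AMZ-EU.csv; a name that does not match prints nothing and returns 1.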

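There is a second level of 'Argument list too long': with a very large number of files, the *.csv wildcard itself no longer fits on the command line. If you hit that limit, you need a bash script like this: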

#!/bin/bash

# Strip "proc_" from every path below in/; skip names that are already taken.
find in/ -type f |
  while IFS= read -r old
  do
    new=$(echo "$old" | sed "s/proc_//g")
    if [ ! -f "$new" ]; then
      echo "$old -> $new"
      mv "$old" "$new"
    fi
  done

Or more selectively using a filename pattern:

#!/bin/bash

# Only files whose path matches this regular expression are renamed.
valid='proc_201403.*'

find in/ -type f |
  while IFS= read -r old
  do
    new=$(echo "$old" | sed "s/proc_//g")
    if [ ! -f "$new" ]; then
      if [[ $old =~ $valid ]]; then
        echo "$old -> $new"
        mv "$old" "$new"
      #else
        #echo "not matched: $valid"
      fi
    fi
  done
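
Both scripts can be folded into one function with an optional pattern and a dry-run switch. This is a rough sketch, not a drop-in replacement: unproc and DRY_RUN are names I made up, and unlike the sed version it only removes the first 'proc_' in the file name itself, never in directory names.

```shell
#!/bin/bash
# Sketch: strip the first "proc_" from file names below a directory.
# unproc and DRY_RUN are hypothetical names; with DRY_RUN set to a
# non-empty value the function only prints what it would do.
unproc() {
  local dir=$1 pattern=${2:-.*} old base new
  # NUL-separated output keeps names with spaces or newlines intact.
  find "$dir" -type f -name '*proc_*' -print0 |
    while IFS= read -r -d '' old; do
      base=${old##*/}                          # file name without directories
      [[ $base =~ $pattern ]] || continue      # optional filter (regex)
      new="${old%/*}/${base/proc_/}"           # drop the first "proc_"
      [ -e "$new" ] && continue                # never overwrite an existing file
      echo "$old -> $new"
      [ -n "${DRY_RUN:-}" ] || mv -- "$old" "$new"
    done
}
```

Usage: DRY_RUN=1 unproc in/ '^proc_201403' shows the planned renames for March 2014; unproc in/ '^proc_201403' then performs them.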

b) Output the results of an SQL query to a file, the quick way, in case you have been using phpMyAdmin for this ;) (USER and DATABASE are placeholders):
$ mysql -u USER -p -D DATABASE -e "SELECT …" > quick_result.txt

c) List the top-level directories together with the number of subdirectories below each of them; use -type f to count files instead (I had to use this to find cluttered directories that caused problems on a server with software RAID and rsync backups):

$ find . -type d | cut -d/ -f 2 | uniq -c | sort -g
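
If you want the number of files rather than directories, and the full path of each top-level directory, something like the following might do. It is a sketch under my own assumptions: files_per_dir is a made-up name, and hidden directories are skipped because the * wildcard does not match them.

```shell
#!/bin/bash
# Sketch: number of regular files below each first-level subdirectory,
# smallest count first. files_per_dir is a hypothetical helper name.
files_per_dir() {
  local root=${1:-.} d count
  for d in "$root"/*/; do
    [ -d "$d" ] || continue                      # wildcard matched nothing
    count=$(( $(find "$d" -type f | wc -l) ))    # arithmetic strips padding
    printf '%s %s\n' "$count" "${d%/}"
  done | sort -n
}
```

Usage: files_per_dir /var/www/project | tail shows the most cluttered directories last.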

d) Prevent 'cp/mv/rm: cannot execute [Argument list too long]' by letting find do the work when the directory you want to copy, move or remove files from holds a very long list of files:
$ cd /var/www/project/incoming/staging
$ find ../production/data/sources/orderstatus/in/ -name '*.xml' -exec cp {} data/sources/orderstatus/in/ \;
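
The command above starts one cp per file. With '-exec ... {} +' find hands over as many file names per call as fit on a command line, which is usually much faster. A minimal sketch, assuming GNU cp (the -t option is a GNU extension); copy_matching is a name I made up:

```shell
#!/bin/bash
# Sketch: copy every file matching a name pattern, many files per cp call.
# copy_matching is a hypothetical helper; cp -t DIR is GNU coreutils only.
copy_matching() {
  local src=$1 pattern=$2 dest=$3
  # "{} +" makes find build argument lists that stay below the system limit.
  find "$src" -type f -name "$pattern" -exec cp -t "$dest" -- {} +
}
```

Usage: copy_matching ../production/data/sources/orderstatus/in/ '*.xml' data/sources/orderstatus/in/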

e) Copy all files from multiple backup sub-directories (structure like this 1122/3344/112233445566.xml) into ONE directory:
$ find ./dumpdirs/11* -name "*.xml" -type f -exec cp {} ./flatToFolder \;
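
One thing to watch out for when flattening: two files with the same name in different sub-directories overwrite each other silently. A sketch that skips and reports such collisions instead; flatten_copy is a made-up name, and which of the colliding files wins depends on the order in which find visits them:

```shell
#!/bin/bash
# Sketch: copy all *.xml files below a tree into ONE directory and report
# name collisions instead of overwriting. flatten_copy is a hypothetical helper.
flatten_copy() {
  local src=$1 dest=$2 f target
  find "$src" -type f -name '*.xml' -print0 |
    while IFS= read -r -d '' f; do
      target="$dest/${f##*/}"                  # same file name, flat directory
      if [ -e "$target" ]; then
        echo "skipped (name already taken): $f" >&2
      else
        cp -- "$f" "$target"
      fi
    done
}
```

Usage: flatten_copy ./dumpdirs ./flatToFolder 2> collisions.txt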

f) Count all files in subdirectories that match the pattern proc_*.xml:
$ find in/ -name "proc_*.xml" | wc -l

g) File list too long when using tar with wildcards: use a file list instead:
$ find in/ -name '*.xml' -print > tarfile.list.txt
$ tar -cjvf evelopmentOrderstati-20140306.tar.bz2 -T tarfile.list.txt
$ rm tarfile.list.txt
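
The temporary list file can probably be avoided by piping the names straight into tar. A sketch assuming GNU tar (--null, -T - and -a are GNU options); archive_matching is a name I made up, and -a picks the compression from the archive suffix:

```shell
#!/bin/bash
# Sketch: archive all files matching a pattern without a temporary list file.
# archive_matching is a hypothetical helper; the tar options are GNU tar only.
archive_matching() {
  local dir=$1 pattern=$2 archive=$3
  # NUL-separated names survive spaces and newlines; "-T -" reads them from stdin.
  find "$dir" -type f -name "$pattern" -print0 |
    tar --null -T - -caf "$archive"
}
```

Usage: archive_matching in/ '*.xml' evelopmentOrderstati-20140306.tar.bz2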

h) File list too long when using grep:
Problem:
$ grep -r "4384940" *
-bash: /bin/grep: Argument list too long
There are too many files in the directory, so the expanded * no longer fits on the command line.

Check:
$ ls -1 | wc -l
256930

Solution:
$ find . -type f -print0 | xargs -0 grep "4384940"

Another way to avoid this problem is to replace the "*" with a ".":
$ grep -r "4384940" .
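
A third variant lets find call grep directly, which saves the xargs pipe. Just a sketch; grep_tree is a made-up name, and -H forces the file name into every hit even if a batch happens to contain a single file:

```shell
#!/bin/bash
# Sketch: search all regular files below a directory for a pattern.
# grep_tree is a hypothetical helper name.
grep_tree() {
  local pattern=$1 dir=${2:-.}
  # "{} +" batches the file names so no single grep call gets too many.
  find "$dir" -type f -exec grep -H -- "$pattern" {} +
}
```

Usage: grep_tree 4384940 . prints one line per hit in the form path:matching line.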