Files and Directories
How to find the last modification date of a file?
Users can use the stat
command to get the detailed status of a file. For example, if a user has a file called example.txt
, then the stat
command can be invoked as
$ stat example.txt
The output will be:
File: example.txt
Size: 0 Blocks: 1 IO Block: 4194304 regular empty file
Device: f96638d6h/4184226006d Inode: 144115339825778942 Links: 1
Access: (0600/-rw-------) Uid: ( 1000/ nischay) Gid: ( 1000/ nischay)
Access: 2020-08-18 23:21:06.000000000 +0530
Modify: 2020-08-18 23:21:06.000000000 +0530
Change: 2020-08-18 23:21:06.000000000 +0530
How to find the deletion date of a file stored in SCRATCH
?
Using date
The deletion date of a file can be obtained using the command date
. For example, if we want to know the deletion date of the file example.txt
, please type the following command
$ date -d "2020-08-18 23:21:06.000000000 +0530 +15days"
Here, 2020-08-18 23:21:06.000000000 +0530
is the access time of the file, obtained using the stat
command as shown earlier. Since files stored in $SCRATCH
are automatically deleted 15 days after their last access, we need to add +15days
to the access time. The output of the above command will be:
Wed Sep 2 23:21:06 IST 2020
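The two steps above (reading the access time with stat, then adding 15 days with date) can be combined into one call. The following is a minimal sketch, assuming GNU stat and GNU date as found on most Linux systems. The function name scratch_deletion_date is hypothetical, and the 15-day retention period is taken from the description above.

```shell
# Hypothetical helper: print the expected deletion date of a file in $SCRATCH.
# Assumes GNU coreutils (stat -c, date -d) and the 15-day retention described above.
scratch_deletion_date() {
    file=$1
    retention_days=15
    # %X prints the last access time as seconds since the Unix epoch.
    atime=$(stat -c %X "$file") || return 1
    # Add the retention period in seconds and print the result in the local time zone.
    date -d "@$((atime + retention_days * 86400))"
}
```

For example, scratch_deletion_date example.txt prints the date in the same format as the date command shown above.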
Using usertools
Alternatively, users can use the usertools
command to get the access time and also the time left before deletion with one single command. Please note that usertools
is not a standard Linux command; it was developed by us to make it easy to check the status of files.
$ usertools file access example.txt
Filename: example.txt
Access Time: 2020-08-18 23:21:06.100206
Time Left: 359 hours 59 minutes 52 seconds.
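The "Time Left" value can also be approximated with standard tools. The sketch below is not the implementation of usertools, whose internals are not described here; it only subtracts the current time from the access time plus 15 days. It assumes GNU stat and date, and the function name scratch_time_left is hypothetical.

```shell
# Hypothetical helper: print the time left before a file in $SCRATCH is deleted.
# Assumes GNU stat/date and the 15-day retention described above.
# The optional second argument overrides "now" (epoch seconds), which helps with testing.
scratch_time_left() {
    file=$1
    now=${2:-$(date +%s)}
    # %X prints the last access time as seconds since the Unix epoch.
    atime=$(stat -c %X "$file") || return 1
    left=$((atime + 15 * 86400 - now))
    # Clamp at zero once the retention period has passed.
    if [ "$left" -le 0 ]; then
        left=0
    fi
    echo "Time Left: $((left / 3600)) hours $((left % 3600 / 60)) minutes $((left % 60)) seconds."
}
```

Running scratch_time_left example.txt prints a line in the same format as the usertools output above.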
How to copy files or directories from my computer to the HPC cluster?
Users of Linux, Windows 10 (1803 and above) and macOS can copy files using the scp
utility. The following commands illustrate copying a file example.txt
and a directory test
from a user's personal machine to the HPC facility.
$ scp example.txt <user>@login.hpc.bits-hyderabad.ac.in:
Here <user>
is the username
of the account used to access the facility. Please note the : (colon) after the domain name. By default, if no path is specified after the colon, the file is copied into the $HOME
directory of the user. Similarly, to copy a directory, the -r
flag is used along with scp
.
$ scp -r test <user>@login.hpc.bits-hyderabad.ac.in:
Alternatively, the above two commands can be merged into a single command.
$ scp -r example.txt test <user>@login.hpc.bits-hyderabad.ac.in:
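The destination argument in the commands above has the form user, @, host name, colon, then an optional path. A relative path after the colon is resolved against the remote $HOME directory. The small helper below is a sketch for building that string without forgetting the colon. The function name hpc_dest and the example values alice and results/ are hypothetical.

```shell
# Hypothetical helper: build the remote destination string for scp or rsync.
# The host name is the login node used in the commands above.
hpc_dest() {
    user=$1
    path=${2:-}   # an empty path means the remote $HOME directory
    printf '%s@login.hpc.bits-hyderabad.ac.in:%s\n' "$user" "$path"
}

# Example (not executed here):
#   scp -r example.txt test "$(hpc_dest alice results/)"
```

The commented example would copy both items into an existing results directory under the remote $HOME instead of $HOME itself.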
Users of macOS
and Linux
can use the rsync
utility, which offers faster resumable transfers.
$ rsync -avz --progress test <user>@login.hpc.bits-hyderabad.ac.in:
Here,
- -a represents archival mode of transfer, which preserves metadata of the file.
- -v represents verbose mode.
- -z represents compression mode, which compresses the data during the transfer and thus improves the transfer rate.
- --progress shows the progress of the transfer.
How to copy files or directories from HPC cluster to my computer?
Users can copy a file or directory from the HPC cluster to their machine by using the scp
command. For example, if a user wishes to transfer a file example.txt
and a directory test
stored in /home/<user>/result
directory, then the following commands can be used.
$ scp <user>@login.hpc.bits-hyderabad.ac.in:/home/<user>/result/example.txt .
$ scp -r <user>@login.hpc.bits-hyderabad.ac.in:/home/<user>/result/test .
Here <user>
is the username
of the account used to access the facility. Please note the . (dot) at the end of the command. The above commands copy example.txt
and test
into the current directory of the user's terminal on their local machine. For more information on scp
and rsync
, users can refer to the manpages
for the utilities. The manual pages of these commands can be accessed using man scp
and man rsync
.
How to access my $HOME directory?
Users can access their $HOME
directory by typing cd $HOME
in a terminal on Sharanga. Note that the home directory of a user is accessible only to that particular user and cannot be viewed or accessed by others in the cluster.
How to access my $SCRATCH space?
Users can access their $SCRATCH
space by typing cd $SCRATCH
in a terminal on Sharanga. Note that the scratch space of a user is accessible only to that particular user and cannot be viewed or accessed by others in the cluster.
SCP file transfer is very slow. Is it possible to speed it up?
Users are advised to use rsync
with the -avz
flags for compressible and resumable file transfers.
What is a good strategy for managing my inode usage?
To manage inode usage, avoid creating a large number of small files. Instead, aggregate data into fewer, larger files where possible. Periodically review and remove unnecessary files or directories.
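One common way to aggregate many small files is to pack them into a single archive, since each file costs one inode while the archive costs only one. The following is a minimal sketch using tar. The function name pack_dir and the default archive name are hypothetical choices, not a facility requirement.

```shell
# Hypothetical helper: pack a directory of many small files into one compressed archive.
pack_dir() {
    dir=${1%/}                      # drop a trailing slash, if any
    archive=${2:-$dir.tar.gz}       # default: <dir>.tar.gz next to the directory
    # -C switches to the parent directory so the archive stores relative paths.
    tar -czf "$archive" -C "$(dirname "$dir")" "$(basename "$dir")"
}

# Verify the archive before removing the originals (not executed here):
#   tar -tzf results.tar.gz | head
```

The original directory should only be removed after the archive has been listed or test-extracted successfully.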
How do I check my inode usage?
To check your inode usage, run the following command:
# For /home use
$ lfs quota /home
# For /scratch use
$ lfs quota /scratch
Alternatively, you can check your inode usage for a specific directory by running the following command:
$ du --inodes /path/to/directory
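To see which subdirectories hold the most entries, a short loop over find and wc is one option. This is a sketch only: the function name inode_hotspots is hypothetical, hidden subdirectories are skipped by the shell pattern, and file names containing newlines would be miscounted.

```shell
# Hypothetical helper: count the entries (files, directories, links) under each
# immediate subdirectory of a directory and list the largest first.
inode_hotspots() {
    top=${1:-.}
    for d in "$top"/*/; do
        [ -d "$d" ] || continue
        # find prints one line per entry, including the subdirectory itself.
        count=$(find "$d" | wc -l)
        printf '%d\t%s\n' "$count" "${d%/}"
    done | sort -rn
}
```

For example, inode_hotspots "$HOME" | head lists the subdirectories of the home directory with the most entries.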
How do I remove a large number of files?
There are two methods to remove a large number of files that are much faster than using rm -rf
.
1. Using RSync
Let us say that we want to delete a directory called
this_directory
. First, create an empty directory.
mkdir empty_dir
Next, run this rsync command to delete the directory:
rsync -aP --delete empty_dir/ this_directory/
This should delete the contents of this_directory
.
Lastly, run rm -r empty_dir
and rm -r this_directory
to remove both of the now-empty directories.
2. Using Perl
If we want to delete a directory this_directory
with a large number of files, we first enter the directory with cd this_directory
, after which we can run the following command:
perl -e 'for(<*>){((stat)[9]<(unlink))}'
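This will delete the non-hidden files inside the directory; the <*> pattern does not match names beginning with a dot, and unlink does not remove subdirectories. Lastly, to delete the directory itself along with anything left in it, run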
cd ..; rm -r this_directory
Comparison of the two approaches
To delete a test batch of 10,000 files, the RSync approach takes 0.136s of real time, while Perl takes 0.0169s.
In comparison, rm -rf
requires at least 0.140s of real time to delete the files, and it becomes considerably slower as the number of files increases.
The benefit of using RSync over Perl, despite it being slower, is that RSync shows progress for the operation.
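The timings above depend on the file system and the current load, so users may wish to repeat the comparison on a throwaway batch of their own. The sketch below creates such a batch in a temporary directory. The function name make_test_batch is hypothetical, and the commented commands show how the Perl and rm -rf approaches could then be timed.

```shell
# Hypothetical helper: create N empty files in a fresh temporary directory
# and print the directory name.
make_test_batch() {
    n=${1:-10000}
    dir=$(mktemp -d) || return 1
    # seq generates the names; xargs spreads them over as many touch calls as needed.
    (cd "$dir" && seq 1 "$n" | xargs touch)
    echo "$dir"
}

# Timing the two approaches on separate batches (not executed here):
#   batch=$(make_test_batch 10000)
#   (cd "$batch" && time perl -e 'for(<*>){((stat)[9]<(unlink))}')
#   rmdir "$batch"
#
#   batch=$(make_test_batch 10000)
#   time rm -rf "$batch"
```

Using a fresh batch for each method keeps the runs comparable, since the first deletion would otherwise leave nothing for the second.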
Warning
Please exercise caution while deleting files, as it might not be possible to restore them once deleted. Make sure that the files within the directory being deleted are definitely unnecessary.
How do I selectively delete files?
For selectively deleting files, the find
command with the -delete
flag is very helpful.
For example, to delete all files with the extension .ext
in a directory, we can use
find . -name "*.ext" -delete
Please note that this search is recursive, and files with this extension in subdirectories will be deleted as well.
find
has a number of other ways to match files, including modification time, regular expressions and permissions, among other things.
You can find out more about the parameters of find
by running the command man find
.
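Because -delete cannot be undone, it can help to list the matching files first and delete only after checking the list. The sketch below does this for the current directory only, using -maxdepth 1 so that subdirectories are left untouched. It assumes GNU find, and the function name delete_by_ext and its "yes" argument are hypothetical.

```shell
# Hypothetical helper: preview, then optionally delete, files with a given extension
# in the current directory only.
delete_by_ext() {
    ext=$1
    confirm=${2:-no}
    # -maxdepth 1 restricts the match to the current directory;
    # -type f skips directories whose names happen to match the pattern.
    if [ "$confirm" = "yes" ]; then
        find . -maxdepth 1 -type f -name "*.$ext" -delete
    else
        find . -maxdepth 1 -type f -name "*.$ext" -print
    fi
}
```

Running delete_by_ext ext prints the files that would be removed, and delete_by_ext ext yes removes them.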
Warning
Please exercise caution while deleting files, as it might not be possible to restore them once deleted. Make sure that the files within the directory being deleted are definitely unnecessary.