|
double de-compression? you're doing it wrong. the proper [as per design] procedure:
1. un-gzip it and redirect the output to the tape device,
... (may necessitate co-ordinated use of 2 hands if tape device doesn't have automatic start/stop control)
2. rewind the tape
... (on some sites a separate rewinding machine - submit both the tape and duly completed rewind request)
3. and then: un-tar (tar -x) it from the tape reader into the destination directory.
... (please remember: folders are where you keep your forms and notes, files are stored in directories.)
some new-fangled versions of tar have gzip built right in, many operators can't grok that.
kids these days: always looking for shortcuts.
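For anyone without a tape drive handy, the same ritual can be sketched on a modern box. This is only a toy walk-through: a plain file stands in for the tape device, and the archive is fabricated first so the whole thing runs standalone.

```shell
set -e
w=$(mktemp -d); cd "$w"

# fabricate a small .tar.gz so the sketch is self-contained
mkdir src && echo "payload" > src/data.txt
tar -cf a.tar src && gzip a.tar            # -> a.tar.gz

# 1. un-gzip and redirect the output to the "tape device"
gunzip -c a.tar.gz > tape.img
# 2. rewind the tape (with a plain file, seeking back to the start is implicit)
# 3. un-tar from the "tape" into the destination directory
mkdir dest && tar -xf tape.img -C dest

cat dest/src/data.txt
```

No co-ordinated use of two hands required, and no rewind request form to fill in.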
This internet thing is amazing! Letting people use it: worst idea ever!
|
|
|
|
|
Files are not stored in directories. (Nor in folders of course.)
|
|
|
|
|
good point, the directory is just a list of name and inode number pairs.
Such a simple system could elegantly do things back then that NTFS still hasn't come close to, without a whole mess of complicated hoops to jump through.
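You can see that name-to-inode mapping from the shell: a hard link is literally a second directory entry pointing at the same inode. A quick sketch (file names invented for the demo):

```shell
set -e
d=$(mktemp -d); cd "$d"
echo "data" > original
ln original alias                 # hard link: a second name for the same inode
ls -i original alias              # both entries show the same inode number

ino1=$(ls -i original | awk '{print $1}')
ino2=$(ls -i alias    | awk '{print $1}')
[ "$ino1" = "$ino2" ] && echo "same inode"
```

Two directory entries, one inode: the directory really is just the name-to-number list.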
|
|
|
|
|
First, you are comparing part of a file system to a whole file system. Second, NTFS is more inode than the inode itself, since it stores the file data itself as an attribute. Third, Windows has had NTFS since 1993. In the Unix world, we have since seen at least half a dozen über-weltmeister, open source, free-as-in-freedom file systems, every one leaving everything else in the dust. So much for extensibility and stability.
|
|
|
|
|
Tape? Tape? We don't use such new-fangled magnetics here!
Send it to the card punch!
Sent from my Amstrad PC 1640
Bad command or file name. Bad, bad command! Sit! Stay! Staaaay...
AntiTwitter: @DalekDave is now a follower!
|
|
|
|
|
luxury! toggle switches!
|
|
|
|
|
luxury! Solder bits of wire!
|
|
|
|
|
Don't make fun of us oldies!
Sure. It was back in the summer of 1978, before I started my studies: I got a summer job in a company where you had to flip switches, deposit, flip switches, deposit... I believe that mini's bootstrap was 12 or 14 instructions long, just enough to read in the short paper tape with the real bootstrap, so that we could then mount the large reel of paper tape with, say, the compiler. For the system software, like the boot, it really wasn't paper: it was either mylar (I never met anyone who could tear off one of those mylar tapes with his bare hands), or plastic-covered aluminum with edges so sharp that it could cut your throat if it got out of control. Some operators used leather gloves as protection. But those tapes never wore out, even when read a dozen times every day.
On the other hand: if you were lucky enough to need the same program the next morning that you had used last the previous day, you didn't have to reload it. The machines had real core memory that retained its contents even if you turned off the power. (One of the guys scoffed at semiconductor RAM, insisting that it was a fad: the computer industry would never accept memory that requires power to be constantly on to retain its contents!)
I even flipped switches at the University, but that was in a more specialized context: in one lab exercise we used a 2901 development kit. The 2901 was a widely used 4-bit bit-slice processor that you could line up in twos for an 8-bit CPU, fours for a 16-bit CPU, eights for a 32-bit CPU. It contained the hard logic, which was controlled by an external microcode RAM. For our lab, we had a 64-word by 16-bit microcode RAM, and we wrote the microcode to read two 4-bit input values and place the sum on the output.
Really knowing what's going on, all the way down to the signal paths, is impossible with the systems of today. Even understanding the operating system thoroughly, software only, is out of reach. I must admit that I sometimes long for those days when everything could be understood: you had a feeling of mastering it all, and you never left anything to automagic mechanisms you would simply have to trust in blind faith. The nearest you'll get today is in embedded programming, coding plain C on, say, an ARM Cortex-M0. I did that for a few years, and it felt as if nostalgia had turned real.
|
|
|
|
|
You need one of these new-fangled magnetic drum memories!
|
|
|
|
|
I must agree, kids these days have all these slang wordings on phones, but can't grok that tar stands for TApe aRchive, i's jst s bvius.
|
|
|
|
|
I wonder how much slang has been influenced by Unix (or *nix, if you prefer). Before *nix, slang made some degree of sense to me, but then we got these absurdities like 'less' for displaying a file (yes, I know its history!), GNU, and a thousand 'funny' but made-up and totally meaningless (in the way they are used) names and terms. I see more and more of that creeping into non-computer slang as well: terms with no etymological background related to the application, but with a completely unrelated meaning that is absurd in the context.
Controlling the development of a natural language makes herding cats look like a task for five year olds.
|
|
|
|
|
In plain English: A tar file is an (uncompressed) file(sssss) archive, i.e. a group of files stored together as one file.
GZip compresses one file: the tar archive.
So what you do, in two steps, is:
1. uncompress
2. unarchive
Now it's down to your tools. Some tools unpack both steps as one.
Maybe use a different tool if doing it in two steps is a bother?
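A minimal sketch of those two explicit steps, with a throwaway archive fabricated first so it's self-contained (all names invented for the demo):

```shell
set -e
t=$(mktemp -d); cd "$t"

# fabricate a .tar.gz to work on
mkdir project && echo "x" > project/file.txt
tar -cf bundle.tar project && gzip bundle.tar    # -> bundle.tar.gz

# Step 1: uncompress (bundle.tar.gz -> bundle.tar)
gunzip bundle.tar.gz
# Step 2: unarchive (bundle.tar -> project/...)
mkdir out && tar -xf bundle.tar -C out

ls out/project
```

Tools that "unpack both steps as one" (like `tar -xzf`) just do this internally.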
|
|
|
|
|
Yes, what he said. You're probably used to ZIP files, which do both jobs in a single file format.
|
|
|
|
|
tar xvf file
Does both in one command
... such stuff as dreams are made on
|
|
|
|
|
megaadam wrote: tar xvf file
Memories, oh Memories!!!
I'd rather be phishing!
|
|
|
|
|
dingdingdingdingding!!!!!!
We have a winner!!!!
|
|
|
|
|
megaadam wrote:
tar xvf file
Does both in one command
Wrong - you need tar xvfz file
|
|
|
|
|
|
|
Simples.
Formats like ZIP employ compression on a file-by-file basis. Compared to a scheme that can compress the entire contents of an archive as one stream, this tends to give poorer compression, especially when the data is spread across lots of small files.
The solution is to slap all of the files together first into one monolithic chunk. You then run compression on that chunk, in the (almost always fulfilled) hope that you'll achieve a smaller output than if you compressed each contained file separately and glued the results together.
TAR - turn a bunch of files into one.
GZ - compress a file.
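That division of labour looks like this in practice (names invented for the demo; `gzip -c` is used so the intermediate .tar survives for comparison):

```shell
set -e
w=$(mktemp -d); cd "$w"
mkdir docs
yes a | head -n 1000 > docs/a.txt     # some repetitive test data
cp docs/a.txt docs/b.txt

tar -cf docs.tar docs                 # TAR: turn a bunch of files into one
gzip -c docs.tar > docs.tar.gz        # GZ:  compress that one file

ls -l docs.tar docs.tar.gz            # the .gz should be much smaller here
```

One tool glues, the other squeezes; chaining them gives you `.tar.gz`.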
|
|
|
|
|
enhzflep wrote: Formats like ZIP employ compression on a file-by-file basis. This is obviously prone to poor rates of compression as compared to a scheme that can compress the entire contents of an archive
If you experience significantly better compression by merging a lot of small files into one, either your average file size is extremely small (as in one classic Unix study showing that, for the system as a whole, more than 80% of the files were less than 5 kbytes),
or you are misinterpreting the data: it is not poorer compression, but more metadata, i.e. administrative information. One large file requires one descriptor; five thousand tiny files require five thousand descriptors. That is not a poorer rate of compression; it is similar to gathering the five thousand files into one even without compression: that would save the space of 4999 inodes, as well as the internal fragmentation loss - if file sizes are evenly distributed, half an allocation unit (disk block) per file. You save space by making this one huge file, but it has nothing to do with data compression.
If you want to make an exact comparison, you cannot compare the size of the .tar file to the size of the .tar.gz file. That would give you the compression rate of the .tar file, but to create the .tar file you had to add a noticeable amount of metadata. So what you save by having only one file/compression descriptor, you partially lose to .tar administrative information.
I keep a number of 'archives' of many small files in .zip format, saving space due to the compression, of course, but also a lot is saved by not wasting 2 Kbyte on each file in internal fragmentation.
Another advantage of .zipping up these file groups: I frequently move the files between machines on USB sticks, and writing a few thousand files to a USB stick takes a lot of time, mostly in creating the files. I guess that it has to do with USB stick writes not being cached, at least not to the same degree, and file creation requires lots of writes, even if the file contents are written in one single write. Writing a single .zip archive to a USB stick is several times faster than writing two thousand tiny files.
A similar situation: We run a fairly large build system, with about a hundred build agents. A build may be producing dozens, in some cases hundreds, of individual artifacts. On the central server, distributing these artifacts, the inode table exploded when each artifact was treated separately. We were forced to modify the builds to pack up related files into archives (usually a single one) to be saved centrally as an artifact of the build.
Most of these advantages come from using an archive file, whether compressed or not. Compression comes as an additional benefit.
When you use .tar.gz as a distribution format, having to ungzip and untar the entire collection is perfectly fine. When using a .zip file as an (often mostly or fully read-only) 'working' archive, extracting a single file quickly is essential. For my use, .tar.gz would be very cumbersome. Also, having the file system retrieve zipped files transparently for applications that do not have unzipping built into the code is great. Of course, a self-explanatory user interface that doesn't require you to memorize a zillion options and command words, can display the directory structure in the archive, and can preview files is also nice. The ability to encrypt files is valuable as well.
I haven't discovered any real disadvantage of .zip even as a distribution format, but for that purpose, .tar.gz is also fine. However, for daily work, I most certainly prefer a format that lets me access individual files in the archive without having to decrypt, ungzip and untar the entire archive.
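The per-file metadata point is easy to see with tar itself: every member gets a 512-byte header, and the default output blocking pads the archive further, so even a 2-byte file turns into a multi-kilobyte .tar. A quick check (exact size depends on your tar's default blocking factor):

```shell
set -e
d=$(mktemp -d); cd "$d"
printf hi > tiny              # a 2-byte file
tar -cf tiny.tar tiny
wc -c tiny tiny.tar           # 2 bytes of data vs. kilobytes of .tar
```

All of that overhead is archive bookkeeping, not compression behaviour.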
|
|
|
|
|
What are you downloading? Most source code bundles are just gzip-compressed, with the .gz extension.
Maybe if there were a bunch of video, jpeg or pdf files, which have built-in compression, the uncompressed .tar file might be nearly as small as, if not smaller than, the .tar.gz file. But muscle memory will automatically add the z option to tar to invoke gzip to compress the output. Similarly, there's little point in adding the -C option to scp when transferring a .tar.gz file.
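The "already compressed data doesn't shrink" point is easy to demonstrate; random bytes stand in for jpeg/video content here (file names invented for the demo):

```shell
set -e
d=$(mktemp -d); cd "$d"
head -c 100000 /dev/urandom > noise.bin    # incompressible, like media files
cp noise.bin noise2.bin && gzip noise2.bin
wc -c noise.bin noise2.bin.gz              # the .gz is roughly the same size,
                                           # possibly a touch larger
```

Compressing compressed data buys you nothing but the gzip header overhead.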
|
|
|
|
|
Because GZ refers to gzip - and TAR refers to Tape Archive - which is not compressed (though some Unixes do use "compress" to compress the files in the TAR archive as a default setting).
You can use tar -zxvf to do the decompression and extraction all in one step.
You probably also noticed that some have TGZ extensions, and some have TAR.GZ extensions - depending on the process used to create the archive.
In short - the reason is the Unix tradition for small programs that do one thing well and allow you to pipe them into one another.
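The pipe version, in that small-tools tradition (archive fabricated on the spot so the sketch runs standalone):

```shell
set -e
p=$(mktemp -d); cd "$p"
mkdir src && echo ok > src/f
tar -cf a.tar src && gzip a.tar       # -> a.tar.gz

# the classic Unix pipe: one small tool feeding another
mkdir out
gunzip -c a.tar.gz | tar -xf - -C out

cat out/src/f
```

`tar -zxvf` is just this pipeline folded into one command.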
|
|
|
|
|
A TAR file in your sense is actually a TAR.GZ file, which embeds two formats: TAR and GZ. Here's the process:
- A TAR file is created, concatenating several files together in their uncompressed form; note that the resulting TAR file is uncompressed.
- A GZ file is created by compressing the previous TAR file.
So to decompress a TAR.GZ file, you have to:
- Decompress the GZ file, and
- "Untar" (unarchive) the resulting uncompressed TAR file.
Note that you can compress a TAR file with other popular compressors (bzip2 => TAR.BZ2, 7zip => TAR.7Z...).
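For example (file names invented; only gzip is run here, since bzip2 or xz may not be installed, but they wrap the same .tar the same way):

```shell
set -e
d=$(mktemp -d); cd "$d"
mkdir proj && echo text > proj/note
tar -cf proj.tar proj                 # the uncompressed TAR, created once

# any compressor can then wrap it:
gzip -c proj.tar > proj.tar.gz        # bzip2 -> proj.tar.bz2, xz -> proj.tar.xz
ls proj.tar proj.tar.gz
```

The .tar layer never changes; only the outer compression format does.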
|
|
|
|
|
The decompression can be combined, but the format is indeed packed twice, as .tar.gz or .tar.bz2 or such.
This is à la Unix, where small operations are combined into one large operation.
Its advantage here: tar (tape archive) concatenates all the files, and the ensuing gz compression can then compress "far" better across all the content, as opposed to .zip's per-file compression.
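A rough demonstration of that "compress across all content" advantage, approximating ZIP's per-file behaviour with one gzip stream per file (all names invented; exact sizes vary by gzip version, but the direction should hold for repetitive data like this):

```shell
set -e
d=$(mktemp -d); cd "$d"
mkdir many
for i in $(seq 1 50); do
    yes "the same line of text" | head -n 20 > "many/f$i"
done

# per-file compression (roughly what .zip does): 50 separate deflate streams
for f in many/*; do gzip -c "$f" >> perfile.bin; done

# solid compression: one stream over the concatenated whole
tar -cf - many | gzip -c > solid.tar.gz

wc -c perfile.bin solid.tar.gz   # the solid archive comes out smaller here
```

With one stream, redundancy shared between files gets exploited; with per-file streams, each file pays the fixed overhead and starts compressing from scratch.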
|
|
|
|
|