Uniq

Revision as of 19:52, 5 December 2016


Author

Jevgeni Kuzmin, A21

16.11.2016

Description

uniq - report or filter out repeated lines in a file.

The uniq utility reads the specified input_file, comparing adjacent lines, and writes a copy of each unique input line to the output_file. If input_file is a single dash (`-') or absent, the standard input is read. If output_file is absent, standard output is used for output. The second and succeeding copies of identical adjacent input lines are not written. Repeated lines in the input will not be detected if they are not adjacent, so it may be necessary to sort the input first.
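
The adjacency rule described above can be seen in a minimal sketch. The three-line input is invented purely for illustration, and it is fed through a pipe so that no file is needed:

```shell
# Illustrative input: the two "b" lines are duplicates but not adjacent.
unsorted_out=$(printf 'b\na\nb\n' | uniq)

# uniq compares only neighbouring lines, so all three lines survive.
printf '%s\n' "$unsorted_out"

# Sorting first places the duplicates next to each other,
# which lets uniq collapse them into a single line.
sorted_out=$(printf 'b\na\nb\n' | sort | uniq)
printf '%s\n' "$sorted_out"
```

The first command prints b, a, b unchanged; the second prints a and b once each.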

Options

-c, --count: Precede each output line with the number of times the line occurred in the input, followed by a single space.
-d, --repeated: Output only lines that are repeated in the input, printing each such line once.
-D, --all-repeated: Output all repeated lines, printing every occurrence.
-f N, --skip-fields=N: Ignore the first N fields in each input line when doing comparisons. A field is a string of non-blank characters separated from adjacent fields by blanks. Field numbers are one based, i.e., the first field is field one.
-i, --ignore-case: Compare lines case-insensitively.
-s N, --skip-chars=N: Ignore the first N characters in each input line when doing comparisons. If specified in conjunction with the -f option, the characters are skipped after the fields skipped by -f. Character numbers are one based, i.e., the first character is character one.
-u, --unique: Output only lines that are not repeated in the input.
-w N, --check-chars=N: Compare no more than N characters in each line.
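
As a rough sketch of how the comparison options behave, the commands below run -i, -f and -s on small made-up inputs. The sample lines (fruit names, timestamped log lines, numbered items) are assumptions chosen only for illustration:

```shell
# -i: case is ignored, so the three spellings count as one repeated line.
# uniq keeps the first line of each group.
ci_out=$(printf 'Apple\napple\nAPPLE\npear\n' | uniq -i)
printf '%s\n' "$ci_out"

# -f 1: the first field (here a timestamp) is skipped before comparing,
# so the two "login ok" lines are treated as identical.
field_out=$(printf '10:01 login ok\n10:02 login ok\n10:03 logout ok\n' | uniq -f 1)
printf '%s\n' "$field_out"

# -s 2: the first two characters (here "1-", "2-", "3-") are skipped.
char_out=$(printf '1-abc\n2-abc\n3-xyz\n' | uniq -s 2)
printf '%s\n' "$char_out"
```

In each case the skipped part is ignored only for the comparison; the line that is printed is the complete first line of its group.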

Examples

Let's say we have an eight-line text file, myfile.txt, which contains the following text:

This is a line.
This is a line.
This is a line.

This is also a line.
This is also a line.

This is also also a line.

Here are several ways to run uniq on this file, and the output each one produces:

  uniq myfile.txt

This is a line.

This is also a line.

This is also also a line.

  uniq -c myfile.txt

3 This is a line.
1
2 This is also a line.
1
1 This is also also a line.

  uniq -d myfile.txt

This is a line.
This is also a line.

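  uniq -u myfile.txt

This is also also a line.

The two blank lines in the file are also unrepeated, so uniq -u prints them as well, as two empty lines ahead of the line shown here.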
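
The examples above can be reproduced with a short script. This is only a sketch: it writes myfile.txt into a temporary directory (the use of mktemp and the variable names are assumptions, not part of the original article) and then runs the same commands:

```shell
# Recreate the eight-line sample file, blank lines included.
workdir=$(mktemp -d)
sample="$workdir/myfile.txt"
printf '%s\n' \
  'This is a line.' 'This is a line.' 'This is a line.' '' \
  'This is also a line.' 'This is also a line.' '' \
  'This is also also a line.' > "$sample"

# Confirm the file really has eight lines.
line_count=$(wc -l < "$sample")

# Plain uniq: one copy of each run of identical adjacent lines.
plain_out=$(uniq "$sample")

# -d: one copy of each line that occurs more than once in a row.
dup_out=$(uniq -d "$sample")

# -u: only lines that are not repeated; count them instead of printing,
# because two of them are empty.
unique_lines=$(uniq -u "$sample" | wc -l)

printf '%s\n' "$plain_out"
printf '%s\n' "$dup_out"
echo "lines printed by uniq -u: $unique_lines"

rm -r "$workdir"
```

Counting the output of uniq -u gives 3, because each of the two blank lines is a group of its own.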

Environment

The LANG, LC_ALL, LC_COLLATE and LC_CTYPE environment variables affect the execution of uniq as described in environ(7).

Exit status

The uniq utility exits 0 on success, and >0 if an error occurs.
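
A script can act on this exit status. The following sketch assumes a deliberately non-existent path (the directory name is a placeholder invented for the example) to provoke an error:

```shell
# Successful run: input arrives on standard input, so uniq exits 0.
ok_status=0
printf 'a\na\nb\n' | uniq > /dev/null || ok_status=$?

# Failing run: the path is a placeholder that is assumed not to exist,
# so uniq cannot open it and exits with a status greater than 0.
err_status=0
uniq /nonexistent-dir-for-uniq-demo/input.txt > /dev/null 2>&1 || err_status=$?

echo "success: $ok_status, failure: $err_status"
```

The exact non-zero value is implementation specific, so it is safer to test for "not 0" than for a particular number.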

Compatibility

The historic +number and -number options have been deprecated but are still supported in this implementation.

Standards

The uniq utility conforms to IEEE Std 1003.1-2001 ("POSIX.1") as amended by Cor. 1-2002.

History

A uniq command appeared in Version 3 AT&T UNIX.

Notes

uniq does not detect repeated lines unless they are adjacent. You may want to sort the input first, or use sort -u instead of uniq.
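
The two approaches mentioned in this note can be compared directly. The sketch below uses an invented list of fruit names; the awk step is an added assumption that only strips the padding uniq -c puts in front of each count:

```shell
# Illustrative input with non-adjacent duplicates.
input=$(printf 'pear\napple\npear\napple\npear\n')

# sort | uniq and sort -u give the same list of distinct lines.
via_uniq=$(printf '%s\n' "$input" | sort | uniq)
via_sort_u=$(printf '%s\n' "$input" | sort -u)

# sort -u cannot count occurrences, so uniq -c is still needed for that.
# awk reprints the count and the word without the leading padding.
counts=$(printf '%s\n' "$input" | sort | uniq -c | awk '{print $1, $2}')

printf '%s\n' "$via_uniq"
printf '%s\n' "$counts"
```

Here both pipelines print apple and pear, and the counts are 2 for apple and 3 for pear.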

Related commands

comm — Compare two sorted files line by line.
pack — Compress files using a Huffman algorithm.
pcat — Print the uncompressed contents of a compressed file.
sort — Sort the lines in a text file.
uncompress — Extract files from compressed archives.