IBM Education Assistance for z/OS V2R1
Item: ASCII Unicode Option
Element/Component: UNIX Shells and Utilities (S&U)
Material is current as of June 2013
© 2013 IBM Corporation
Filename: zOS V2R1 USS S&U ASCII Unicode Option
Agenda
■ Trademarks
■ Presentation Objectives
■ Overview
■ Usage & Invocation
■ Migration & Coexistence Considerations
■ Presentation Summary
■ Appendix
Trademarks
■ See http://www.ibm.com/legal/copytrade.shtml for a list of trademarks.
Presentation Objectives
■ Introduce the features and benefits of the new z/OS UNIX Shells and Utilities (S&U) support for working with ASCII/Unicode files.
Overview
■ Problem Statement
– As a z/OS UNIX Shells & Utilities user, I want the ability to control the text conversion of input files used by the S&U commands.
– As a z/OS UNIX Shells & Utilities user, I want the ability to run tagged shell scripts (tcsh scripts and SBCS sh scripts) under different SBCS locales.
■ Solution
– Add the -W filecodeset=codeset,pgmcodeset=codeset option to several S&U commands to enable text conversion, consistent with the support added to vi and ex in V1R13.
– Add the -B option to several S&U commands to disable automatic text conversion, consistent with other commands that already have this override support.
– Add the new _TEXT_CONV environment variable to enable or disable text conversion.
– With automatic conversion enabled, tagged shell scripts (tcsh scripts and SBCS sh scripts) can be run under different SBCS locales.
Note: Tagged non-SBCS sh scripts (for example, DBCS or MBCS) are not supported.
■ Benefits
– More detailed control of text conversion
• No file tagging required
• No environment or system setup required
– Easily override the system's automatic text conversion
– Easily enable or disable text conversion for all S&U commands that provide control of text conversion
– Easily run tagged shell scripts (tcsh scripts and SBCS sh scripts) under SBCS locales
Usage & Invocation
■ The -W filecodeset=codeset,pgmcodeset=codeset option was added to the following commands:
cat      cmp      comm     cut      diff     dircmp   ed
egrep    expand   fgrep    file     grep     head     more
paste    sed      strings  tail     unexpand uniq     wc
● Consistent with support added to vi and ex in V1R13.
● Option keywords are case sensitive.
● The only supported values for pgmcodeset are IBM-1047 and 1047.
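The two rules above (case-sensitive keywords, and pgmcodeset limited to IBM-1047 or 1047) can be illustrated with a small pre-check. This is only a sketch. The function name check_w_arg is hypothetical and not part of S&U, and the real commands do their own validation, which may differ in detail. It only inspects the string and does not run any S&U command.

```shell
# Hypothetical helper (not an S&U command): checks a -W argument string
# against the two rules above. Exit status 0 = acceptable, 1 = rejected.
# The body runs in a subshell, so IFS and globbing changes do not leak.
check_w_arg() (
    [ -n "$1" ] || exit 1
    IFS=,            # split the argument on commas
    set -f           # disable filename globbing while splitting
    for pair in $1; do
        key=${pair%%=*}
        value=${pair#*=}
        case $key in
            filecodeset)
                ;;                          # file code set is not restricted here
            pgmcodeset)
                case $value in
                    IBM-1047|1047) ;;       # the only supported values
                    *) exit 1 ;;
                esac
                ;;
            *)
                exit 1 ;;                   # unknown or wrongly cased keyword
        esac
    done
    exit 0
)

check_w_arg "filecodeset=ISO8859-1,pgmcodeset=IBM-1047" && echo "accepted"
check_w_arg "FILECODESET=ISO8859-1,pgmcodeset=IBM-1047" || echo "rejected"
```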
■ -W filecodeset=codeset,pgmcodeset=codeset option details
– Performs text conversion from one code set to another when reading from or writing to the file. filecodeset=codeset names the coded character set of the file; pgmcodeset=codeset names the coded character set of the program (command).
– The filecodeset and pgmcodeset options can be used on files with any file tag.
– If pgmcodeset is specified but filecodeset is omitted, the file code set defaults to ISO8859-1, even if the file is tagged with a different code set. The default program code set is IBM-1047.
– When standard input (stdin) is used as an input text file and stdin is not associated with a terminal, the -W filecodeset and pgmcodeset options are applied to stdin.
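The defaulting rule above can be made explicit before a command line is written. The sketch below assumes a hypothetical helper, explicit_w_arg, which fills in the documented defaults (ISO8859-1 for the file, IBM-1047 for the program) and prints the fully spelled-out -W argument. It builds a string only and does not invoke any S&U command.

```shell
# Hypothetical helper: print a fully explicit -W argument.
#   $1 = filecodeset value, or empty to take the documented default
#   $2 = pgmcodeset value,  or empty to take the documented default
explicit_w_arg() {
    printf 'filecodeset=%s,pgmcodeset=%s\n' "${1:-ISO8859-1}" "${2:-IBM-1047}"
}

# Only pgmcodeset given: the file code set falls back to ISO8859-1.
explicit_w_arg "" "IBM-1047"
# Only filecodeset given: the program code set falls back to IBM-1047.
explicit_w_arg "UTF-8" ""
```

Spelling out both keywords avoids surprises when a file carries a tag that differs from the ISO8859-1 default.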
■ The -B option was added to the following commands:
cat      comm     cut      diff     dircmp   ed
egrep    expand   fgrep    grep     more     paste
sed      unexpand uniq     wc
● Disables the automatic text conversion of tagged input files. This option is ignored if the filecodeset or pgmcodeset options (-W option) are specified.
● When standard input (stdin) is used as an input text file and stdin is not associated with a terminal, -B disables the automatic conversion of stdin.
● The head, strings, and tail commands, which already supported -B, were changed so that -B also disables automatic conversion of stdin.
■ Support for the new _TEXT_CONV environment variable was added to the following commands:
cat      cmp      comm     cut      diff     dircmp   ed       egrep
ex       expand   fgrep    file     grep     head     more     pack
paste    sed      strings  tail     unexpand uniq     wc       vi
● Contains text conversion information for the command.
● Supported value keywords are FILECODESET, PGMCODESET, and DISABLE (disable automatic conversion of tagged files).
● Applies to all commands that support the filecodeset and pgmcodeset options (-W option) and the -B option.
● _TEXT_CONV is ignored when the filecodeset or pgmcodeset options (-W option) or the -B option are specified.
● The pack command supports only _TEXT_CONV=DISABLE.
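To make the three value keywords concrete, the following sketch splits a _TEXT_CONV value into its parts. It assumes the KEYWORD(value) syntax shown in this presentation's export example, with items separated by commas. The function name parse_text_conv is hypothetical, and the real commands may accept variations that this sketch rejects.

```shell
# Hypothetical parser for a _TEXT_CONV value, assuming the syntax
#   FILECODESET(codeset),PGMCODESET(codeset)   or   DISABLE
# Runs in a subshell so IFS and globbing changes do not leak.
parse_text_conv() (
    file=
    pgm=
    disable=no
    IFS=,
    set -f
    for item in $1; do
        case $item in
            "FILECODESET("*")")
                v=${item#"FILECODESET("}    # strip the keyword and "("
                file=${v%")"}               # strip the trailing ")"
                ;;
            "PGMCODESET("*")")
                v=${item#"PGMCODESET("}
                pgm=${v%")"}
                ;;
            DISABLE)
                disable=yes
                ;;
            *)
                exit 1                      # not one of the three keywords
                ;;
        esac
    done
    printf 'file=%s pgm=%s disable=%s\n' "$file" "$pgm" "$disable"
)

parse_text_conv "FILECODESET(ISO8859-1),PGMCODESET(IBM-1047)"
parse_text_conv "DISABLE"
```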
■ _TEXT_CONV environment variable (continued)
User beware! When _TEXT_CONV is set, every command that supports either the -W option or the -B option performs the requested action (conversion from FILECODESET to PGMCODESET, or DISABLE) on every file it processes, regardless of the file, because automatic text conversion and file tagging are ignored.
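One way to limit that reach is the standard shell form VAR=value command, which places the variable in the environment of a single command instead of exporting it for the whole session. The sketch below uses sh -c as a stand-in for an S&U command so that it runs on any POSIX shell; on z/OS a command such as cat would take its place. The variable names scoped and after are illustrative only.

```shell
# Start from a clean state for the demonstration.
unset _TEXT_CONV

# _TEXT_CONV is visible only to this one command (sh -c is a stand-in
# for an S&U command and simply reports the value it sees).
scoped=$(_TEXT_CONV="DISABLE" sh -c 'printf "%s\n" "${_TEXT_CONV:-unset}"')

# A later command does not inherit the setting.
after=$(sh -c 'printf "%s\n" "${_TEXT_CONV:-unset}"')

echo "first command saw:  $scoped"
echo "second command saw: $after"
```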
■ The -W option, -B option, and _TEXT_CONV environment variable apply only to the primary input text file(s) processed by the command.
■ Text conversion for files that the command uses for reference purposes (file lists, configuration, control information, etc.) is not affected by the -W option, -B option, or _TEXT_CONV.
■ Output produced by these commands (standard output or output files) is not affected by the new support. The only exception is an output file that is the same as, or associated with, a primary input file. For example, the editor commands (ex, vi, ed, sed, and more) exploit this exception.
■ Note the following precedence rules:
– The -W filecodeset=codeset,pgmcodeset=codeset option overrides the -B option, the _TEXT_CONV environment variable, and the system's automatic text conversion.
– The -B option overrides the _TEXT_CONV environment variable and the system's automatic text conversion.
– The _TEXT_CONV environment variable overrides the system's automatic text conversion. If the DISABLE value keyword is used along with either the FILECODESET or PGMCODESET value keyword, the DISABLE value keyword is ignored.
– If the -W filecodeset=codeset,pgmcodeset=codeset option, the -B option, and the _TEXT_CONV environment variable are not specified, the system's automatic text conversion rules apply.
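The precedence rules above amount to a first-match decision chain. The sketch below models it with a hypothetical function, resolve_conversion, that reports which setting wins. The result labels (W, B, ENV_CONVERT, ENV_DISABLE, AUTO, UNKNOWN) are invented for illustration, and the function is a model of the rules, not the commands' actual implementation.

```shell
# Hypothetical model of the precedence rules.
#   $1 = -W argument (empty if -W filecodeset/pgmcodeset was not given)
#   $2 = "yes" if -B was given, anything else otherwise
#   $3 = value of _TEXT_CONV (empty if not set)
resolve_conversion() {
    w_arg=$1
    b_flag=$2
    text_conv=$3
    if [ -n "$w_arg" ]; then
        printf 'W\n'                       # -W overrides everything else
    elif [ "$b_flag" = yes ]; then
        printf 'B\n'                       # -B overrides _TEXT_CONV and the system
    elif [ -n "$text_conv" ]; then
        case $text_conv in
            *FILECODESET*|*PGMCODESET*)
                printf 'ENV_CONVERT\n' ;;  # DISABLE, if also present, is ignored
            *DISABLE*)
                printf 'ENV_DISABLE\n' ;;
            *)
                printf 'UNKNOWN\n' ;;      # other values are not described here
        esac
    else
        printf 'AUTO\n'                    # system automatic conversion rules
    fi
}

resolve_conversion "filecodeset=ISO8859-1,pgmcodeset=IBM-1047" yes "DISABLE"
resolve_conversion "" no "DISABLE,FILECODESET(ISO8859-1)"
resolve_conversion "" no ""
```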
■ Example #1: To display the type of an untagged text file containing ISO8859-1 characters, issue:
file -W filecodeset=ISO8859-1,pgmcodeset=IBM-1047 myAsciiFile
■ Example #2: To display the <newline> count of a file containing EBCDIC characters when automatic conversion is enabled and the file is incorrectly tagged as UTF-8, issue:
wc -lB myMisTaggedFile
■ Example #3: To perform text conversion from the ASCII code set ISO8859-1 to the EBCDIC code set IBM-1047 for all supported commands, issue:
export _TEXT_CONV="FILECODESET(ISO8859-1),PGMCODESET(IBM-1047)"
■ Shell scripts
Previously, shell scripts were limited by the following rule: the code page in which a shell script is encoded must match the code page of the locale in which it is run.
With this new support, shell scripts (tcsh scripts and SBCS sh scripts) can be tagged and run correctly when automatic conversion is enabled and the locale is SBCS.
Tagged non-SBCS sh scripts (for example, DBCS or MBCS) are not supported.
■ Example #1: To run an sh script encoded in ASCII (ISO8859-1) under an IBM-1047 locale:
export _BPXK_AUTOCVT=ALL
chtag -tc ISO8859-1 ascii.sh
ascii.sh
■ Unicode conversion environment variables
User beware! These environment variables affect the conversion result.
● _BPXK_UNICODE_TECHNIQUE=x (x=R,E,C,L,M,0-9) overrides the default conversion technique used when Unicode Services is called. The default value is LMREC.
● _BPXK_UNICODE_SUB=(YES|NO) indicates whether the Unicode Services substitute character action is applied during translation.
● _BPXK_UNICODE_MAL=(YES|NO) indicates whether the Unicode Services malformed character action is applied during translation.
Migration & Coexistence Considerations
■ Several commands already supported the -B option. Three of them (head, strings, and tail) did not disable automatic conversion of tagged files read from standard input:
– head -B < myTaggedFile
– strings -B < myTaggedFile
– tail -B < myTaggedFile
■ The head, strings, and tail commands were changed so that the -B option also applies to standard input.
Presentation Summary
■ Several additional S&U commands now provide more detailed control of text conversion to assist S&U users when working with ASCII/Unicode files.
■ Shell scripts (tcsh scripts and SBCS sh scripts) can be tagged and run when automatic conversion is enabled and the locale is SBCS.
Appendix
■ See z/OS V2R1 "UNIX System Services Command Reference" (SA23-2280) for the S&U command updates.
■ See the appendix "Controlling text conversion for z/OS UNIX shell commands" in the z/OS V2R1 "UNIX System Services Command Reference" for details on controlling text conversion for S&U.
■ See z/OS V2R1 "Unicode Services User's Guide and Reference" (SA38-0680) for the details of Unicode data conversion.