pmlogger_daily(1) — Linux manual page

NAME | SYNOPSIS | DESCRIPTION | OPTIONS | CALLBACKS | CONFIGURATION | FILES | PCP ENVIRONMENT | COMPATIBILITY ISSUES | SEE ALSO | COLOPHON

PMLOGGER_DAILY(1)        General Commands Manual       PMLOGGER_DAILY(1)

NAME         top

       pmlogger_daily - administration of Performance Co-Pilot archive
       files

SYNOPSIS         top

       $PCP_BINADM_DIR/pmlogger_daily [-DEfKMNoprRVzZ?]  [-c control]
       [-k time] [-l logfile] [-m addresses] [-s size] [-t want] [-x
       time] [-X program] [-Y regex]

DESCRIPTION         top

       pmlogger_daily and the related pmlogger_check(1) tools along with
       associated control files (see pmlogger.control(5)) may be used to
       create a customized regime of administration and management for
       historical archives of performance data within the Performance
       Co-Pilot (see PCPIntro(1)) infrastructure.

       pmlogger_daily is intended to be run once per day, preferably in
       the early morning, as soon after midnight as practicable.  Its
       task is to aggregate, rotate and perform general housekeeping one
       or more sets of PCP archives.

       To accommodate the evolution of PMDAs and changes in production
       logging environments, pmlogger_daily is integrated with
       pmlogrewrite(1) to allow optional and automatic rewriting of
       archives before merging.  If there are global rewriting rules to
       be applied across all archives mentioned in the control file(s),
       then create the directory $PCP_SYSCONF_DIR/pmlogrewrite and place
       any pmlogrewrite(1) rewriting rules in this directory.  For
       rewriting rules that are specific to only one family of archives,
       use the directory name from the control file(s) - i.e. the fourth
       field - and create a file, or a directory, or a symbolic link
       named pmlogrewrite within this directory and place the required
       rewriting rule(s) in the pmlogrewrite file or in files within the
       pmlogrewrite subdirectory.  pmlogger_daily will choose rewriting
       rules from the archive directory if they exist, else rewriting
       rules from $PCP_SYSCONF_DIR/pmlogrewrite if that directory
       exists, else no rewriting is attempted.

       As an alternate mechanism, if the file
       $PCP_LOG_DIR/pmlogger/.NeedRewrite exists when pmlogger_daily
       starts then this is treated the same as specifying -R on the
       command line and $PCP_LOG_DIR/pmlogger/.NeedRewrite will be
       removed once all the rewriting has been done.

OPTIONS         top

       -c control, --control=control
            Both pmlogger_daily and pmlogger_check(1) are controlled by
            PCP logger control file(s) that specifies the pmlogger
            instances to be managed.  The default control file is
            $PCP_PMLOGGERCONTROL_PATH, but an alternate may be specified
            using the -c option.  If the directory
            $PCP_PMLOGGERCONTROL_PATH.d (or control.d from the -c
            option) exists, then the contents of any additional control
            files therein will be appended to the main control file
            (which must exist).

       -D, --noreport
            Do not perform the conditional pmlogger_daily_report(1)
            processing as described below.

       -E, --expunge
            This option causes pmlogger_daily to pass the -E flag to
            pmlogger_merge(1) in order to expunge metrics with metadata
            inconsistencies and continue rather than fail.  This is
            intended for automated daily archive rotation where it is
            highly desirable for unattended daily archive merging,
            rewriting and compression to succeed.  For further details,
            see pmlogger_merge(1) and description for the -x flag in
            pmlogextract(1).

       -f, --force
            This option forces pmlogger_daily to attempt compression
            actions.  Using this option in production is not
            recommended.

       -k time, --discard=time
            After some period, old PCP archives are discarded.  time is
            a time specification in the syntax of find-filter(1), so
            DD[:HH[:MM]].  The optional HH (hours) and MM (minutes)
            parts are 0 if not specified.  By default the time is 14:0:0
            or 14 days, but may be changed using this option.

            Some special values are recognized for the time, namely 0 to
            keep no archives beyond the the ones being currently written
            by pmlogger(1), and forever or never to prevent any archives
            being discarded.

            The time can also be set using the $PCP_CULLAFTER variable,
            set in either the environment or in a control file.  If both
            $PCP_CULLAFTER and -k specify different values for time then
            the environment variable value is used and a warning is
            issued, i.e. if $PCP_CULLAFTER is set in the control file,
            it overrides -k given on the command line.

            Note that the semantics of time are that it is measured from
            the time of last modification of each archive, and not from
            the original archive creation date.  This has subtle
            implications for compression (see below) - the compression
            process results in the creation of new archive files which
            have new modification times.  In this case, the time period
            (re)starts from the time of compression.

       -K   When this option is specified for pmlogger_daily then only
            the compression tasks are attempted, so no pmlogger
            rotation, no culling, no rewriting, etc.  When -K is used
            and a period of 0 is in effect (from -x on the command line
            or $PCP_COMPRESSAFTER in the environment or via the control
            file) this is intended for environments where compression of
            archives is desired before the scheduled daily processing
            happens.  To achieve this, once pmlogger_check(1) has
            completed regular processing, it calls pmlogger_daily with
            just the -K option.  Provided $PCP_COMPRESSAFTER is set to 0
            along with any other required compression options to match
            the scheduled invocation of pmlogger_daily, then this will
            compress all volumes except the ones being currently written
            by pmlogger(1).  If $PCP_COMPRESSAFTER is set to a value
            greater than zero, then manually running pmlogger_daily with
            the -x option may be used to compress volumes that are
            younger than the $PCP_COMPRESSAFTER time.  This may be used
            to reclaim filesystem space by compressing volumes earlier
            than they would have otherwise been compressed.  Note that
            since the default value of $PCP_COMPRESSAFTER is 0 days, the
            -x option has no effect unless the control file has been
            edited and $PCP_COMPRESSAFTER has been set to a value
            greater than 0.

       -l file, --logfile=file
            In order to ensure that mail is not unintentionally sent
            when these scripts are run from cron(8) or systemd(1)
            diagnostics are always sent to log files.  By default, this
            file is $PCP_LOG_DIR/pmlogger/pmlogger_daily.log but this
            can be changed using the -l option.  If this log file
            already exists when the script starts, it will be renamed
            with a .prev suffix (overwriting any log file saved earlier)
            before diagnostics are generated to the log file.  The -l
            and -t options cannot be used together.

       -m addresses, --mail=addresses
            Use of this option causes pmlogger_daily to construct a
            summary of the ``notices'' file entries which were generated
            in the last 24 hours, and e-mail that summary to the set of
            space-separated addresses.  This daily summary is stored in
            the file $PCP_LOG_DIR/NOTICES.daily, which will be empty
            when no new ``notices'' entries were made in the previous 24
            hour period.

       -M   This option may be used to disable archive merging (or
            renaming) and rewriting (-M implies -r).  This is most
            useful in cases where the archives are being incrementally
            copied to a remote repository, e.g. using rsync(1).
            Merging, renaming and rewriting all risk an increase in the
            synchronization load, especially immediately after
            pmlogger_daily has run, so -M may be useful in these cases.

       -N, --showme
            This option enables a ``show me'' mode, where the programs
            actions are echoed, but not executed, in the style of ``make
            -n''.  Using -N in conjunction with -V maximizes the
            diagnostic capabilities for debugging.

       -o   By default all possible archives will be merged.  This
            option reinstates the old behaviour in which only
            yesterday's archives will be considered as merge candidates.
            In the special case where only a single input archive needs
            to be merged, pmlogmv(1) is used to rename the archive,
            otherwise pmlogger_merge(1) is used to merge all of the
            archives for a single host and a single day into a new PCP
            archive and the individual archives are removed.

       -p   If this option is specified for pmlogger_daily then the
            status of the daily processing is polled and if the daily
            pmlogger(1) rotation, culling, rewriting, compressing, etc.
            has not been done in the last 24 hours then it is done now.
            The intent is to have pmlogger_daily called regularly with
            the -p option (at 30 mins past the hour, every hour in the
            default cron(8) set up) to ensure daily processing happens
            as soon as possible if it was missed at the regularly
            scheduled time (which is 00:10 by default), e.g. if the
            system was down or suspended at that time.  With this option
            pmlogger_daily simply exits if the previous day's processing
            has already been done.  Note that this option is not used on
            platforms supporting systemd(1) because the
            pmlogger_daily.timer service unit specifies a timer setting
            with Persistent=true.  The -K and -p options to
            pmlogger_daily are mutually exclusive.

       -r, --norewrite
            This command line option acts as an override and prevents
            all archive rewriting with pmlogrewrite(1) independent of
            the presence of any rewriting rule files or directories.

       -R, --rewriteall
            Sometimes PMDA changes require all archives to be rewritten,
            not just the ones involved in any current merging.  This is
            required for example after a PCP upgrade where a new version
            of an existing PMDA has revised metadata.  The -R command
            line forces this universal-style of rewriting.  The -R
            option to pmlogger_daily is mutually exclusive with both the
            -r and -M options.

       -s size, --rotate=size
            If the PCP ``notices'' file ($PCP_LOG_DIR/NOTICES) is larger
            than 20480 bytes, pmlogger_daily will rename the file with a
            ``.old'' suffix, and start a new ``notices'' file.  The
            rotate threshold may be changed from 20480 to size bytes
            using the -s option.

       -t period
            To assist with debugging or diagnosing intermittent failures
            the -t option may be used.  This will turn on very verbose
            tracing (-VV) and capture the trace output in a file named
            $PCP_LOG_DIR/pmlogger/daily.datestamp.trace, where datestamp
            is the time pmlogger_daily was run in the format
            YYYYMMDD.HH.MM.  In addition, the period argument will
            ensure that trace files created with -t will be kept for
            period days and then discarded.

       -V, --verbose
            The output from the cron execution of the scripts may be
            extended using the -V option to the scripts which will
            enable verbose tracing of their activity.  By default the
            scripts generate no output unless some error or warning
            condition is encountered.  A second -V increases the
            verbosity.  Using -N in conjunction with -V maximizes the
            diagnostic capabilities for debugging.

       -x time, --compress-after=time
            Archive data files can optionally be compressed after some
            period to conserve disk space.  This is particularly useful
            for large numbers of pmlogger processes under the control of
            pmlogger_daily.

            time is a time specification in the syntax of
            find-filter(1), so DD[:HH[:MM]].  The optional HH (hours)
            and MM (minutes) parts are 0 if not specified.

            Some special values are recognized for the time, namely 0 to
            apply compression as soon as possible, and forever or never
            to prevent any compression being done.

            If transparent_decompress is enabled when libpcp was built
            (can be checked with the pmconfig(1) -L option), then the
            default behaviour is compression ``as soon as possible''.
            Otherwise the default behaviour is to not compress files
            (which matches the historical default behaviour in earlier
            PCP releases).

            The time can also be set using the $PCP_COMPRESSAFTER
            variable, set in either the environment or in a control
            file.  If both $PCP_COMPRESSAFTER and -x specify different
            values for time then the environment variable value is used
            and a warning is issued.  For important other detailed notes
            concerning volume compression, see the -K and -k options
            (above).

       -X program, --compressor=program
            This option specifies the program to use for compression -
            by default this is xz(1).  The environment variable
            $PCP_COMPRESS may be used as an alternative mechanism to
            define program.  If both $PCP_COMPRESS and -X specify
            different compression programs then the environment variable
            value is used and a warning is issued.

       -Y regex, --regex=regex
            This option allows a regular expression to be specified
            causing files in the set of files matched for compression to
            be omitted - this allows only the data file to be
            compressed, and also prevents the program from attempting to
            compress it more than once.  The default regex is
            ".(index|Z|gz|bz2|zip|xz|lzma|lzo|lz4)$" - such files are
            filtered using the -v option to egrep(1).  The environment
            variable $PCP_COMPRESSREGEX may be used as an alternative
            mechanism to define regex.  If both $PCP_COMPRESSREGEX and
            -Y specify different values for regex then the environment
            variable value is used and a warning is issued.

       -z   This option causes pmlogger_daily to not ``re-exec'', see
            pmlogger(1), when it would otherwise choose to do so and is
            intended only for QA testing.

       -Z   This option causes pmlogger_daily to ``re-exec'', see
            pmlogger(1), whenever that is possible and is intended only
            for QA testing.

       -?, --help
            Display usage message and exit.

CALLBACKS         top

       Additionally pmlogger_daily supports the following ``hooks'' to
       allow auxiliary operations to be performed at key points in the
       daily processing of the archives.  These callbacks are controlled
       via variables that may be set in the environment or via the
       control file.

       Note that merge callbacks and autosaving described below are not
       enabled when only compression tasks are being attempted, i.e.
       when -K command line option is used.

       All of the callback script execution and the autosave file moving
       will be executed as the non-privileged user ``pcp'' and group
       ``pcp'', so appropriate permissions may need to have been set up
       in advance.

       $PCP_MERGE_CALLBACK
            As each day's archive is created by merging and before any
            compression takes place, if $PCP_MERGE_CALLBACK is defined,
            then it is assumed to be a script that will be called with
            one argument being the name of the archive (stripped of any
            suffixes), so something of the form
            /some/directory/path/YYYYMMDD.  The script needs to be
            either a full pathname, or something that will be found on
            the shell's $PATH .  The callback script will be run in the
            foreground, so pmlogger_daily will wait for it to complete.

            If the control file contains more than one
            $PCP_MERGE_CALLBACK specification then these will be run
            serially in the order they appear in the control file.  If
            $PCP_MERGE_CALLBACK is defined in the environment when
            pmlogger_daily is run, this is treated as though this option
            was the first in the control file, i.e. it will be run
            before any merge callbacks mentioned in the control file.

            If the pcp-zeroconf packages is installed, then a special
            merge callback is added to call pmlogger_daily_report(1)
            first, before any other merge callback options.  Refer to
            pmlogger_daily_report(1) for an explanation of the pcp-
            zeroconf requirements.

            If pmlogger_daily is in ``catch up'' mode (more than one
            day's worth of archives need to be combined) then each call
            back is executed once for each day's archive that is
            generated.

            A typical use might be to produce daily reports from the PCP
            archive which needs to wait until the archive has been
            created, but is more efficient if it is done before any
            potential compression of the archive.

       $PCP_COMPRESS_CALLBACK
            If pmlogger_daily is run with -x 0 or $PCP_COMPRESSAFTER=0,
            then compression is done immediately after merging.  As each
            day's archive is compressed, if $PCP_COMPRESS_CALLBACK is
            defined, then it is assumed to be a script that will be
            called with one argument being the name of the archive
            (stripped of any suffixes), so something of the form
            /some/directory/path/YYYYMMDD.  The script needs to be
            either a full pathname, or something that will be found on
            the shell's $PATH .  The callback script will be run in the
            foreground, so pmlogger_daily will wait for it to complete.

            If the control file contains more than one
            $PCP_COMPRESS_CALLBACK specification then these will be run
            serially in the order they appear in the control file.  If
            $PCP_COMPRESS_CALLBACK is defined in the environment when
            pmlogger_daily is run, this is treated as though this option
            was the first in the control file, i.e. it will be run
            first.

            If pmlogger_daily is in ``catch up'' mode (more than one
            day's worth of archives need to be compressed) then each
            call back is executed once for each day's archive that is
            compressed.

            A typical use might be to keep recent archives in
            uncompressed form for efficient querying, but move the older
            archives to some other storage location once the compression
            has been done.

       $PCP_AUTOSAVE_DIR
            Once the merging and possible compression has been done by
            pmlogger_daily, if $PCP_AUTOSAVE_DIR is defined then all of
            the physical files that make up one day's archive will be
            moved (autosaved) to the directory specified by
            $PCP_AUTOSAVE_DIR.

            The basename of the archive is used to set the reserved
            words DATEYYYY (year), DATEMM (month) and DATEDD (day) and
            these (along with LOCALHOSTNAME) may appear literally in
            $PCP_AUTOSAVE_DIR, and will be substituted at execution time
            to generate the destination directory name.  For example:
                  $PCP_AUTOSAVE_DIR=/gpfs/LOCALHOSTNAME/DATEYYYY/DATEMM-
                  DATEDD

            Note that these ``date'' reserved words correspond to the
            date on which the archive data was collected, not the date
            that pmlogger_daily was run.

            If $PCP_AUTOSAVE_DIR (after LOCALHOSTNAME and ``date''
            substitution) does not exist then pmlogger_daily will
            attempt to create it (along with any parent directories that
            do not exist).  Just be aware that this directory creation
            runs under the uid of the user ``pcp'', so directories along
            the path to $PCP_AUTOSAVE_DIR may need to be writeable by
            this non-root user.

            By ``move'' the archives we mean a paranoid checksum-copy-
            checksum-remove (using the -c option for pmlogmv(1)) that
            will bail if the copy fails or the checksums do not match
            (the archives are important so we cannot risk something like
            a full filesystem or a permissions issue messing with the
            copy process).

            If pmlogger_daily is in ``catch up'' mode (more than one
            day's worth of archives need to be combined) then the
            archives for more than one day could be copied in this step.

            A typical use might be to create PCP archives on a local
            filesystem initially, then once all the data for a single
            day has been collected and merged, migrate that day's
            archive to a shared filesystem or a remote filesystem.  This
            may allow automatic backup to off-site storage and/or reduce
            the number of I/O operations and filesystem metadata
            operations on the (potentially slower) non-local filesystem.

CONFIGURATION         top

       Refer to pmlogger.control(5) for a description of the contol
       file(s) that are used to control which pmlogger instances and
       which archives are managed by pmlogger_check and
       pmlogger_daily(1).

FILES         top

       $PCP_VAR_DIR/config/pmlogger/config.default
            default pmlogger configuration file location for the local
            primary logger, typically generated automatically by
            pmlogconf(1).

       $PCP_ARCHIVE_DIR/<hostname>
            default location for archives of performance information
            collected from the host hostname

       $PCP_ARCHIVE_DIR/<hostname>/lock
            transient lock file to guarantee mutual exclusion during
            pmlogger administration for the host hostname - if present,
            can be safely removed if neither pmlogger_daily nor
            pmlogger_check(1) are running

       $PCP_ARCHIVE_DIR/<hostname>/Latest
            PCP archive folio created by mkaf(1) for the most recently
            launched archive containing performance metrics from the
            host hostname

       $PCP_LOG_DIR/NOTICES
            PCP ``notices'' file used by pmie(1) and friends

       $PCP_LOG_DIR/pmlogger/pmlogger_daily.log
            if the previous execution of pmlogger_daily produced any
            output it is saved here.  The normal case is no output in
            which case the file does not exist.

       $PCP_ARCHIVE_DIR/SaveLogs
             if this directory exists, then the log file from the -l
             argument for pmlogger_daily will be saved in this directory
             with the name of the format <date>-pmlogger_daily.log.<pid>
             or <date>-pmlogger_daily-K.log.<pid> This allows the log
             file to be inspected at a later time, even if several
             pmlogger_daily executions have been launched in the
             interim.  Because the PCP archive management tools run
             under the $PCP_USER account ``pcp'',
             $PCP_ARCHIVE_DIR/SaveLogs typically needs to be owned by
             the user ``pcp''.

       $PCP_ARCHIVE_DIR/<hostname>/SaveLogs
              if this directory exists, then the log file from the -l
              argument of a newly launched pmlogger(1) for hostname will
              be saved in this directory with the name archive.log where
              archive is the basename of the associated pmlogger(1) PCP
              archive files.  This allows the log file to be inspected
              at a later time, even if several pmlogger(1) instances for
              hostname have been launched in the interim.  Because the
              PCP archive management tools run under the uid of the user
              ``pcp'', $PCP_ARCHIVE_DIR/<hostname>/SaveLogs typically
              needs to be owned by the user ``pcp''.

       $PCP_LOG_DIR/pmlogger/.NeedRewrite
               if this file exists, then this is treated as equivalent
               to using -R on the command line and the file will be
               removed once all rewriting has been done.

PCP ENVIRONMENT         top

       Environment variables with the prefix PCP_ are used to
       parameterize the file and directory names used by PCP.  On each
       installation, the file /etc/pcp.conf contains the local values
       for these variables.  The $PCP_CONF variable may be used to
       specify an alternative configuration file, as described in
       pcp.conf(5).

COMPATIBILITY ISSUES         top

       Earlier versions of pmlogger_daily used find(1) to locate files
       for compressing or culling and the -k and -x options took only
       integer values to mean ``days''.  The semantics of this was quite
       loose given that find(1) offers different precision and semantics
       across platforms.

       The current implementation of pmlogger_daily uses find-filter(1)
       which provides high precision intervals and semantics that are
       relative to the time of execution and are consistent across
       platforms.

SEE ALSO         top

       egrep(1), find-filter(1), PCPIntro(1), pmconfig(1), pmlc(1),
       pmlogconf(1), pmlogctl(1), pmlogextract(1), pmlogger(1),
       pmlogger_check(1), pmlogger_daily_report(1), pmlogger_merge(1),
       pmlogmv(1), pmlogrewrite(1), systemd(1), xz(1) and cron(8).

COLOPHON         top

       This page is part of the PCP (Performance Co-Pilot) project.
       Information about the project can be found at 
       ⟨http://www.pcp.io/⟩.  If you have a bug report for this manual
       page, send it to [email protected].  This page was obtained from the
       project's upstream Git repository
       ⟨https://github.com/performancecopilot/pcp.git⟩ on 2024-06-14.
       (At that time, the date of the most recent commit that was found
       in the repository was 2024-06-14.)  If you discover any rendering
       problems in this HTML version of the page, or you believe there
       is a better or more up-to-date source for the page, or you have
       corrections or improvements to the information in this COLOPHON
       (which is not part of the original manual page), send a mail to
       [email protected]

Performance Co-Pilot               PCP                 PMLOGGER_DAILY(1)

Pages that refer to this page: find-filter(1)pcp-atop(1)pcp-atopsar(1)pcpintro(1)pmlc(1)pmlogctl(1)pmlogdump(1)pmlogextract(1)pmlogger(1)pmlogger_check(1)pmlogger_daily(1)pmlogger_daily_report(1)pmlogger_merge(1)pmlogger_rewrite(1)pmloglabel(1)pmsearch(1)pmsnap(1)pmdiscoversetup(3)LOGARCHIVE(5)pmlogger.control(5)