I'm trying to use Amazon S3 to store the logs from my applications. Running /usr/bin/s3cmd --help tells me how to upload files:

s3cmd --help
usage: s3cmd [options] COMMAND [parameters]

S3cmd is a tool for managing objects in Amazon S3 storage. It allows for
making and removing "buckets" and uploading, downloading and removing
"objects" from these buckets.

options:
  -h, --help            show this help message and exit
  --configure           Invoke interactive (re)configuration tool.
  -c FILE, --config=FILE
                        Config file name. Defaults to
                        /home/valter.silva/.s3cfg
  --dump-config         Dump current configuration after parsing config files
                        and command line options and exit.
  -n, --dry-run         Only show what should be uploaded or downloaded but
                        don't actually do it. May still perform S3 requests to
                        get bucket listings and other information though (only
                        for file transfer commands)
  -e, --encrypt         Encrypt files before uploading to S3.
  --no-encrypt          Don't encrypt files.
  -f, --force           Force overwrite and other dangerous operations.
  --continue            Continue getting a partially downloaded file (only for
                        [get] command).
  --skip-existing       Skip over files that exist at the destination (only
                        for [get] and [sync] commands).
  -r, --recursive       Recursive upload, download or removal.
  --check-md5           Check MD5 sums when comparing files for [sync].
                        (default)
  --no-check-md5        Do not check MD5 sums when comparing files for [sync].
                        Only size will be compared. May significantly speed up
                        transfer but may also miss some changed files.
  -P, --acl-public      Store objects with ACL allowing read for anyone.
  --acl-private         Store objects with default ACL allowing access for you
                        only.
  --acl-grant=PERMISSION:EMAIL or USER_CANONICAL_ID
                        Grant stated permission to a given amazon user.
                        Permission is one of: read, write, read_acp,
                        write_acp, full_control, all
  --acl-revoke=PERMISSION:USER_CANONICAL_ID
                        Revoke stated permission for a given amazon user.
                        Permission is one of: read, write, read_acp,
                        write_acp, full_control, all
  --delete-removed      Delete remote objects with no corresponding local file
                        [sync]
  --no-delete-removed   Don't delete remote objects.
  -p, --preserve        Preserve filesystem attributes (mode, ownership,
                        timestamps). Default for [sync] command.
  --no-preserve         Don't store FS attributes
  --exclude=GLOB        Filenames and paths matching GLOB will be excluded
                        from sync
  --exclude-from=FILE   Read --exclude GLOBs from FILE
  --rexclude=REGEXP     Filenames and paths matching REGEXP (regular
                        expression) will be excluded from sync
  --rexclude-from=FILE  Read --rexclude REGEXPs from FILE
  --include=GLOB        Filenames and paths matching GLOB will be included
                        even if previously excluded by one of
                        --(r)exclude(-from) patterns
  --include-from=FILE   Read --include GLOBs from FILE
  --rinclude=REGEXP     Same as --include but uses REGEXP (regular expression)
                        instead of GLOB
  --rinclude-from=FILE  Read --rinclude REGEXPs from FILE
  --bucket-location=BUCKET_LOCATION
                        Datacentre to create bucket in. As of now the
                        datacenters are: US (default), EU, us-west-1, and ap-
                        southeast-1
  --reduced-redundancy, --rr
                        Store object with 'Reduced redundancy'. Lower per-GB
                        price. [put, cp, mv]
  --access-logging-target-prefix=LOG_TARGET_PREFIX
                        Target prefix for access logs (S3 URI) (for [cfmodify]
                        and [accesslog] commands)
  --no-access-logging   Disable access logging (for [cfmodify] and [accesslog]
                        commands)
  -m MIME/TYPE, --mime-type=MIME/TYPE
                        Default MIME-type to be set for objects stored.
  -M, --guess-mime-type
                        Guess MIME-type of files by their extension. Falls
                        back to default MIME-Type as specified by --mime-type
                        option
  --add-header=NAME:VALUE
                        Add a given HTTP header to the upload request. Can be
                        used multiple times. For instance set 'Expires' or
                        'Cache-Control' headers (or both) using this options
                        if you like.
  --encoding=ENCODING   Override autodetected terminal and filesystem encoding
                        (character set). Autodetected: UTF-8
  --verbatim            Use the S3 name as given on the command line. No pre-
                        processing, encoding, etc. Use with caution!
  --list-md5            Include MD5 sums in bucket listings (only for 'ls'
                        command).
  -H, --human-readable-sizes
                        Print sizes in human readable form (eg 1kB instead of
                        1234).
  --progress            Display progress meter (default on TTY).
  --no-progress         Don't display progress meter (default on non-TTY).
  --enable              Enable given CloudFront distribution (only for
                        [cfmodify] command)
  --disable             Disable given CloudFront distribution (only for
                        [cfmodify] command)
  --cf-add-cname=CNAME  Add given CNAME to a CloudFront distribution (only for
                        [cfcreate] and [cfmodify] commands)
  --cf-remove-cname=CNAME
                        Remove given CNAME from a CloudFront distribution
                        (only for [cfmodify] command)
  --cf-comment=COMMENT  Set COMMENT for a given CloudFront distribution (only
                        for [cfcreate] and [cfmodify] commands)
  --cf-default-root-object=DEFAULT_ROOT_OBJECT
                        Set the default root object to return when no object
                        is specified in the URL. Use a relative path, i.e.
                        default/index.html instead of /default/index.html or
                        s3://bucket/default/index.html (only for [cfcreate]
                        and [cfmodify] commands)
  -v, --verbose         Enable verbose output.
  -d, --debug           Enable debug output.
  --version             Show s3cmd version (1.0.0) and exit.
  -F, --follow-symlinks
                        Follow symbolic links as if they are regular files

But it doesn't say how to check whether a file has been uploaded, or how to delete the uploaded ones. Should I verify via MD5 and delete the local copies with some kind of shell script?

3 Answers

FWIW, I needed to do something similar, so I wrote the following bash script. What it does:

  1. gets a list of files in a directory that are older than $MINUTES minutes, using find
  2. uses lsof to determine whether a file is open (this can be fooled if the file is, say, open in an editor)
  3. uses s3cmd to copy the file to an S3 bucket
  4. compares the MD5 sums of the remote file on S3 and the local one; if they match, deletes the local file

-

#!/bin/bash
MINUTES=60
TARGET_DIR="s3://AWSbucketname/subfolder/`hostname -s`/"
LOCAL_DIR="/path/to/folder"
FILES=()

echo ""
echo "About to upload files in $LOCAL_DIR up to S3 folder:"
echo "    $TARGET_DIR"
echo "Then delete if MD5 sums line up."
echo "Starting in 5 seconds..."
sleep 5

cd "$LOCAL_DIR" || exit 1

# Throw the list of files that the find command gets into an array
while IFS= read -d $'\0' -r file ; do
    FILES=("${FILES[@]}" "$file")
done < <(find "$LOCAL_DIR" -name \*.wav -mmin +$MINUTES -print0)

# echo "${FILES[@]}"   # DEBUG

for local_file in "${FILES[@]}"
do
    # Check that the file in question is not open.
    # lsof returns a non-zero exit status for a file not in use
    lsof "$local_file" > /dev/null 2>&1
    if test $? -ne 0 ; then
        echo ""
        echo "$local_file isn't open. Copying to S3..."
        s3cmd -p put "$local_file" "$TARGET_DIR"
        # s3cmd -n put "$local_file" "$TARGET_DIR" # DEBUG - dry-run

        ## Now attempt to delete if the MD5 sums check out:

        remote_file=${local_file##*/}
        md5sum_remote=`s3cmd info "$TARGET_DIR$remote_file" | grep MD5 | awk '{print $3}'`
        md5sum_local=`md5sum "$local_file" | awk '{print $1}'`
        if [[ "$md5sum_remote" == "$md5sum_local" ]]; then
          echo "$remote_file MD5 sum checks out. Deleting..."
          rm "$local_file"
        fi
    fi
done
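The MD5-compare step above can be exercised in isolation without touching S3. In this sketch the hard-coded hash stands in for the value that `s3cmd info` would report for the remote object (it is simply the md5sum of the literal string "hello\n"):

```shell
#!/bin/sh
# Sketch of the MD5-compare step in isolation (no S3 involved).
# The hard-coded hash stands in for the value `s3cmd info` would report.
md5sum_local=$(printf 'hello\n' | md5sum | awk '{print $1}')
md5sum_remote="b1946ac92492d2347c6235b4d2611184"
if [ "$md5sum_local" = "$md5sum_remote" ]; then
    echo "MD5 sum checks out. Safe to delete the local copy."
else
    echo "MD5 mismatch. Keeping the local copy."
fi
```

Running the full script from cron after testing it this way keeps a mismatch from ever deleting a file that did not upload cleanly.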

After a while I managed to put together a bash script that compares the md5sums of both the S3 and my local files and deletes the local files that are already in Amazon S3:

#!/bin/bash
datacenter="amazon"
hostname=`hostname`;
path="backup/server245"

s3=`s3cmd ls --list-md5 -H s3://company-backup/company/"$datacenter"/"$hostname"/"$path"/`

s3_list=`echo "$s3" | awk '{print $4" "$5}' | sed 's= .*/= ='`

locally=`md5sum /"$path"/*.gz`;
locally_list=$(echo "$locally" | sed 's= .*/= =');
#echo "$locally_list";

IFS=$'\n'
for i in $locally_list
do
  #echo $i
  locally_hash=`echo "$i" | awk '{print $1}'`
  locally_file=`echo "$i" | awk '{print $2}'`

  for j in $s3_list
  do
    s3_hash=$(echo "$j" | awk '{print $1}');
    s3_file=$(echo "$j" | awk '{print $2}');

    #to avoid empty file when have only hash from folder
    if [[ $s3_hash != "" ]] && [[ $s3_file != "" ]]; then 
      if [[ $s3_hash == $locally_hash ]] && [[ $s3_file == $locally_file ]]; then
        echo "### REMOVING ###";
        echo "$locally_file";
        #rm /"$path"/"$locally_file";
      fi
    fi
  done
done
unset IFS
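One caveat for both scripts above: the "MD5" that s3cmd reports is actually the object's ETag. For objects uploaded in a single PUT the ETag is the plain MD5, but if your s3cmd version uses multipart uploads for large files, the ETag carries a "-&lt;part count&gt;" suffix and is not a straight MD5, so the comparison would silently never match and the local file would never be deleted. A minimal sketch of detecting that case (the ETag value here is hypothetical):

```shell
#!/bin/sh
# Hypothetical ETag as reported for a multipart upload; note the "-2" suffix.
etag="d41d8cd98f00b204e9800998ecf8427e-2"
case "$etag" in
  *-*) echo "multipart ETag - not a plain MD5, skip the comparison" ;;
  *)   echo "plain MD5 ETag - safe to compare with md5sum" ;;
esac
```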

From the official documentation:

--delete-after (Perform deletes AFTER new uploads [sync])

or

--delete-after-fetch (Delete remote objects after fetching to local file (only for [get] and [sync] commands).)

if you want to sync from remote to local.

https://s3tools.org/usage
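For the remote-to-local direction, the invocation would look roughly like this; the bucket name and paths are placeholders, and -n makes it a dry run so nothing is actually fetched or deleted while you check the output:

```shell
# Placeholder bucket/paths; drop -n once the dry-run output looks right.
s3cmd sync -n --delete-after-fetch s3://your-bucket/logs/ /var/local/logs/
```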
