Using a table or a script with sed to replace many special characters with escape characters?

Question

If you want to replace special characters with sed, there are several ways to do it. The problem is that I need to escape many (more than 100) special characters in many files.

So the escapes needed are (thanks, Peter):

^^ to escape a single ^
^| to escape |
\& to escape &
\/ to escape /
\\ to escape \

Suppose you have more than 100 lines like these, spread across many files:

sed.exe -i "s/{\*)(//123/
sed -i "s/\\/123/g;" 1.txt
sed.exe -i "s/{\*)(//123/
sed -i "s/\\/123/g;" 1.txt
.....
.....

These lines contain many special characters that need escaping (we have more than 100 such lines).
Escaping them by hand is very slow work. So I need a table-driven script, similar to wReplace, that I can call from the command line to escape the special characters and then replace them with my own words.
How can I do this?

score 2 · Accepted Answer · 2011-03-09T19:38:42

Note that ^^ for ^, ^| for |, and ^& for & are not required by sed. The ^ escape character is required by the CMD shell. If your text is exposed neither to the command line nor as a command parameter in a CMD/.bat script, the only escape character you need to consider is sed's, which is the backslash \. These are two entirely different scopes (which can overlap), so it is often best to keep everything within sed's scope, as shown below.
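To illustrate the point about scopes, here is a minimal sketch (not part of the original answer) that keeps the sed commands in a script file, so no shell ever interprets them. The file name demo.sed and the sample text are hypothetical. Inside the sed script, ^ in the middle of a pattern, | and & in the pattern are all literal in a basic regular expression; only the backslash itself has to be escaped.

```shell
# Hypothetical demo: the sed commands live in a file, so only sed's own
# escaping rules apply (no CMD-style ^ escapes, no shell quoting issues).
tmpdir=$(mktemp -d)

# The quoted heredoc delimiter ('EOF') stops the shell from touching the text.
cat > "$tmpdir/demo.sed" <<'EOF'
s/a^b|c&d/[&]/
s/\\/BACKSLASH/g
EOF

# Feed a sample line containing ^ | & and \ through the script.
result=$(printf '%s\n' 'x a^b|c&d y \ z' | sed -f "$tmpdir/demo.sed")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The first command wraps the literal text a^b|c&d in brackets without any escaping; the second needs \\ only because the backslash is sed's own escape character.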

Here is a sed script that replaces any number of find-strings that you specify with their corresponding replacement strings. The general line format is a cross between a sed substitution command (s/abc/xyz/p) and a tabular format. You can "stretch" the middle delimiter so that the columns line up.
You can use either a FIXED-string pattern (F/...) or a normal sed-style regular-expression pattern (s/...), and you can adjust sed -n and each /p (in table.txt) as needed.

For a minimal run you need 3 files (plus a 4th, which is derived dynamically from table.txt):

the main script: table-to-regex.sed
the table file: table.txt
the target file: file-to-change.txt
the derived script: table-derrived.sed

To run one table against one target file:

sed -nf table-to-regex.sed  table.txt > table-derrived.sed
# Here, check `table-derrived.sed` for errors as described in the example *table.txt*.  

sed -nf table-derrived.sed  file-to-change.txt
# Redirect *sed's* output via `>` or `>>` as need be, or use `sed -i -nf`

If you want to run table.txt against many files, just put the snippet above into a simple loop to suit your requirements. I can do that trivially in bash, but someone more familiar with the Windows CMD shell would be better placed to do it than I am.
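For the bash case, one possible shape of such a loop is sketched below. The function name apply_table, the ".out" suffix and the *.txt glob are assumptions made for illustration; they do not come from the answer. The sketch assumes the derived script has already been generated and checked for errors.

```shell
# Sketch: apply one derived sed script to many files.
# Assumption: the derived script (e.g. table-derrived.sed) already exists.
apply_table() {
    script=$1
    shift
    for file in "$@"; do
        # With -n and the table's /p flags, only changed lines are printed.
        # Output goes to a separate ".out" file (hypothetical naming) so the
        # original file is left untouched.
        sed -nf "$script" "$file" > "$file.out" || return 1
    done
}

# Example call (commented out so that pasting the sketch has no side effects):
# apply_table table-derrived.sed ./*.txt
```

Writing to a separate file is the cautious choice here. If the whole file should be kept, with only the matching lines modified, drop -n and the /p flags as the answer suggests.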

Here is the script: table-to-regex.sed

s/[[:space:]]*$//  # remove trailing whitespace

/^$\|^[[:space:]]*#/{p; b}  # empty and sed-style comment lines: print and branch
                            # printing keeps line numbers; for referencing errors

/^\([Fs]\)\(.\)\(.*\2\)\{4\}/{  # too many delims ERROR
      s/^/# error + # /p        # print a flagged/commented error
      b }                       # branch

/^\([Fs]\)\(.\)\(.*\2\)\{3\}/{                  # this may be a long-form 2nd delimiter
   /^\([Fs]\)\(.\)\(.*\2[[:space:]]*\2.*\2\)/{  # is long-form 2nd delimiter OK?
      s/^\([Fs]\)\(.\)\(.*\)\2[[:space:]]*\2\(.*\)\2\(.*\)/\1\2\n\3\n\4\n\5/
      t OK                                      # branch on true to :OK
   }; s/^/# error L # /p                        # print a flagged/commented error
      b }                                       # branch: long-form 2nd delimiter ERROR

/^\([Fs]\)\(.\)\(.*\2\)\{2\}/{     # this may be short-form delimiters
   /^\([Fs]\)\(.\)\(.*\2.*\2\)/{   # is short-form delimiters OK?
      s/^\([Fs]\)\(.\)\(.*\)\2\(.*\)\2\(.*\)/\1\2\n\3\n\4\n\5/
      t OK                         # branch on true to :OK  
   }; s/^/# error S # /p           # print a flagged/commented error
      b }                          # branch: short-form delimiters ERROR

{ s/^/# error - # /p        # print a flagged/commented error
  b }                       # branch: too few delimiters ERROR

:OK     # delimiters are okay
#============================
h   # copy the pattern-space to the hold space

# NOTE: /^s/ lines are considered to contain regex patterns, not FIXED strings.
/^s/{    s/^s\(.\)\n/s\1/   # shrink long-form delimiter to short-form
     :s; s/^s\(.\)\([^\n]*\)\n/s\1\2\1/; t s  # branch on true to :s 
      p; b }                                  # print and branch

# The following code handles FIXED-string /^F/ lines

s/^F.\n\([^\n]*\)\n.*/\1/  # isolate the literal find-string in the pattern-space
s/[]\/$*.^[]/\\&/g         # escape BRE metacharacters; '|' is literal in a BRE, and '\|' would mean alternation in GNU sed
H                          # append \n + find-regex to the hold-space

g   # Copy the modified hold-space back into the pattern-space

s/^F.\n[^\n]*\n\([^\n]*\)\n.*/\1/  # isolate the literal repl-string in the pattern-space
s/[\/&]/\\&/g                      # convert the literal repl-string into a regex of itself
H                                  # append \n + repl-regex to the hold-space

g   # Copy the modified hold-space back into the pattern-space

# Rearrange pattern-space into a / delimited command: s/find/repl/...      
s/^\(F.\)\n\([^\n]*\)\n\([^\n]*\)\n\([^\n]*\)\n\([^\n]*\)\n\([^\n]*\)$/s\/\5\/\6\/\4/

p   # Print the modified find-and-replace regular expression line
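The FIXED-string handling can be tried in isolation. The following is a standalone sketch of the same idea, not the script's exact commands: it escapes a literal find-string and a literal replacement string, then builds an s/// command from them. The variable names and sample strings are hypothetical. It uses | as the delimiter of the escaping commands so that the backslash and the slash can both be listed explicitly in the bracket expressions.

```shell
# Hypothetical sample strings, both meant to be taken literally.
find='a.b*c/d'
repl='x&y/z'

# Escape the characters that are special in a basic regular expression,
# plus the / that will be used as the delimiter of the final command.
find_re=$(printf '%s\n' "$find" | sed 's|[]\\/$*.^[]|\\&|g')

# In the replacement, only the backslash, the delimiter and & are special.
repl_re=$(printf '%s\n' "$repl" | sed 's|[\\/&]|\\&|g')

# Show the derived command, then apply it to a sample line.
printf 's/%s/%s/g\n' "$find_re" "$repl_re"
result=$(printf '%s\n' 'a.b*c/d and aXbbc/d' | sed "s/$find_re/$repl_re/g")
printf '%s\n' "$result"
```

Only the first occurrence in the sample line is replaced: aXbbc/d would match the unescaped pattern a.b*c/d as a regular expression, but not as a literal string.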

Here is an example table file, with a description of how it works: table.txt

# The script expects an input table file, which can contain 
#   comment, blank, and substitution lines. The text you are
#   now reading is part of an input table file.

# Comment lines begin with optional whitespace followed by #

# Each substitution line must start with: 's' or 'F'
#  's' lines are treated as a normal `sed` substitution regular expressions
#  'F' lines are considered to contain `FIXED` (literal) string expressions 
# The 's' or 'F' must be followed by the 1st of 3 delimiters   
#   which must not appear elsewhere on the same line.
# A pre-test is performed to ensure conformity. Lines with 
#   too many or too few delimiters, or no 's' or 'F', are flagged   
#   with the text '# error ? #', which effectively comments them out.
#   '?' can be: '-' too few, '+' too many, 'L' long-form, 'S' short-form
#   Here is an example of a long-form error, as it appears in the output. 

# error L # s/example/(7+3)/2=5/

# 1st delimiter, eg '/' must be a single character.
# 2nd (middle) delimiter has two possible forms:
#   Either it is exactly the same as the 1st delimiter: '/' (short-form)
#   or it has a double-form for column alignment: '/      /' (long-form)
#   The long-form can have any amount of whitespace between the 2 '/'s
# 3rd delimiter must be the same as the 1st delimiter.

# After the 3rd delimiter, you can put any of sed's 
#    substitution commands, eg. 'g'

# With one condition, a trailing '#' comment to 's' and 'F' lines is
#    valid. The condition is that no delimiter character can be in the 
#    comment (delimiters must not appear elsewhere on the same line)

# For 's' type lines, it is implied that *you* have included all the 
#    necessary sed-escape characters!  The script does not add any 
#    sed-escape characters for 's' type lines. It will, however, 
#    convert a long-form middle-delimiter into a short-form delimiter.   

# For 'F' type lines, it is implied that both strings (find and replace) 
#    are FIXED/literal-strings. The script does add the  necessary 
#    sed-escape characters for 'F' type lines. It will also 
#    convert a long-form middle-delimiter into a short-form delimiter.   

# The result is a sed-script which contains one sed-substitution 
#    statement per line; it is just a modified version of your 
#    's' and 'F' strings "table" file.

# Note that the 1st delimiter is *always* in column 2.

# Here are some sample 's' and 'F' lines, with comments:
#

F/abc/ABC/gp               #-> These 3 are the same for 's' and 'F', 
s/abc/ABC/gp               #-> as no characters need to be escaped,  
s/abc/         /ABC/gp     #-> and the 2nd delimiter shrinks to one  

F/^F=Fixed/    /\1okay/p   # \1 is okay here, It is a FIXED literal
s|^s=sed regex||\1FAIL|p   # \1 will FAIL: back-reference not defined!

F|\\\\|////|               # this line == next line 
F|\\\\|        |////|p     # this line == previous line  
s|\\\\|        |////|p     # this line is different; 's' vs 'F'

F_Hello! ^.&`//\\*$/['{'$";"`_    _Ciao!_   # literal find / replace

Here is an example input file whose text you want to change: file-to-change.txt

abc abc
^F=Fixed
   s=sed regex
\\\\ \\\\ \\\\ \\\\
Hello! ^.&`//\\*$/['{'$";"`
some non-matching text
