How can I extract text between parentheses that contain a specific word?

Question 1

I found several sites where they run:

sudo adduser paul admin

but my Linux has no admin group, so I use:

sudo adduser paul sudo

Question 2

Using sed:

< inputfile sed 's/(\([^\)]*\(bar\|blat\)[^\)]*\))/\1/g; s/(.*) //g'

Input file:

test (bar) (blat)
bar (testblat) (bartest)
blat (testbar) (barblat) (no) (blatanother)

Output file:

test bar blat
bar testblat bartest
blat testbar barblat blatanother

Breakdown:

# 1:

(: matches a ( character
\(: starts the capture group
[^\)]*: matches 0 or more characters other than )
\(: starts the group of allowed strings
bar: matches the first allowed string
\|: separates the second allowed string
blat: matches the second allowed string
\): ends the group of allowed strings
[^\)]*: matches 0 or more characters other than )
\): ends the capture group
): matches a ) character

# 2:

(: matches a ( character
.*: matches 0 or more characters
): matches a ) character
" ": matches a space character
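For readers who find the two substitutions easier to follow outside sed, here is a minimal Python 3 sketch of the same two passes. The function name keep_matching_groups and the WORDS tuple are hypothetical, introduced only for illustration; the patterns are a direct transcription of the sed expressions above, not a separate method from the answer.

```python
import re

WORDS = ("bar", "blat")  # the allowed strings from the sed command above


def keep_matching_groups(line, words=WORDS):
    """Hypothetical helper mirroring the two sed substitutions."""
    alternation = "|".join(re.escape(w) for w in words)
    # Pass 1: unwrap every "(...)" group that contains one of the words,
    # keeping only the text between the parentheses.
    unwrapped = re.sub(r"\(([^)]*(?:%s)[^)]*)\)" % alternation, r"\1", line)
    # Pass 2: delete any remaining parenthesized text plus the space after it.
    # The match is greedy, exactly like "(.*) " in the sed command.
    return re.sub(r"\(.*\) ", "", unwrapped)


if __name__ == "__main__":
    for sample in (
        "test (bar) (blat)",
        "bar (testblat) (bartest)",
        "blat (testbar) (barblat) (no) (blatanother)",
    ):
        print(keep_matching_groups(sample))
```

As with the sed version, the second pass only removes a leftover group when it is followed by a space, so a rejected group at the very end of a line is left in place by this sketch.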

Question 3

Question 4

Using Python:

#!/usr/bin/env python2
import re
with open('/path/to/file.txt') as f:
    for line in f:
        pat_list = re.findall(r'\(([^)]*?)\)', line.rstrip())
        for pat in pat_list:
            if not re.search(r'(?:blat|bar)', pat):
                print re.sub(r'\(|\)', '', line.replace(' ({0})'.format(pat), '').rstrip())

Output:

foo bar 80
foo blat 92

Here we used Python's re (regular expression) module. pat_list holds the list of strings found inside parentheses. We then check each member of pat_list for "blat" or "bar". When a member contains neither, we print the line with that member removed and the remaining parentheses stripped.
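The script above targets Python 2. The following is a minimal Python 3 sketch of the same idea, wrapped in a function so it can be tried without a file. The name strip_unwanted_groups is hypothetical. Unlike the script, it removes every non-matching group before returning, and it returns each line once even when all of its groups match.

```python
import re


def strip_unwanted_groups(line, words=("blat", "bar")):
    """Hypothetical helper: drop "(...)" groups that contain none of the words."""
    wanted = re.compile("|".join(re.escape(w) for w in words))
    result = line.rstrip()
    # Same pattern as in the script: the text between "(" and the next ")".
    for content in re.findall(r"\(([^)]*?)\)", result):
        if not wanted.search(content):
            # Remove the group together with the space in front of it.
            result = result.replace(" ({0})".format(content), "")
    # Strip the parentheses that are still left around the kept groups.
    return re.sub(r"\(|\)", "", result)


if __name__ == "__main__":
    for sample in ("foo (blah) (bar 80)", "foo (cats) (blat 92)"):
        print(strip_unwanted_groups(sample))
```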

Question 5

Using awk: save the following code in a text file and make it executable (chmod u+x filename).

Then run it like this:

awk -f filename inputfile

It is huge compared with the Perl or Python solutions. I am adding it only because awk or sed was the preferred solution, and to show that awk can do the job, even though it is not convenient.

{
#list of words to look for in parentheses: (named "w" to speed up adding items)
w[0] = "bar";
w[1] = "blat";

#"bool" value: whether or not to crop spaces around omitted parentheses and their content
cropSpaces = 1;

spaces = 0;                     #space counter used for cropping 
open = 0;                       #open/nested parenthesis counter
st = 0;                         #marks index where parenthesis starts
end = 0;                        #marks index where parenthesis ends
out = 0;                        #"bool" value indicating whether or not the word has been found
for(i = 1;i-1 < length($0);i++){     #for each character
  c = substr($0,i,1);                 #get character
  if(c == "("){                       #character is '('
    open++;                            #increment parenthesis counter
    if(open == 1) st = i+1;            #marks start of parenthesis (if not nested)
  }
  else if(c == ")"){                 #char is ')'
    open--;                           #decrement parenthesis counter
    if(open == 0) end = i;            #mark end of parenthesis (if not nested)
  }
  else{                             #any other char
    if(open == 0){                   #outside of parenthesis
      if(cropSpaces && c == " "){     #char is space (and cropSpaces option is not 0) 
        if(spaces == 0) printf "%s", c; #print space if not sequential
        spaces++;                      #increment space counter
      }
      else{                           #any other char
        spaces = 0;                    #set previous spaces counter to 0
        printf "%s", c;                #print char (explicit format so "%" in the input is safe)
      }
    }
    else if(!out){                   #inside of parenthesis (and no word has been found)
      for(j = 0; j < length(w); j++){               #for every word in list
        if( substr( $0,i,length(w[j]) ) == w[j]){    #if word matches
          out = 1;                                    #word has been found
          break;                                      #do not look for any other words
        }
      }
    }
  }
  if(open == 0 && out){              #outside of parenthesis and word found in previous parenthesis
    printf "%s", substr($0,st,end-st); #print content
    out = 0;                          #reset "word found" indicator 
    spaces = 0;                       #reset spaces counter
  }
}

printf "\n";                        #print newline
}
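The character-by-character scan in the awk program can also be sketched in Python 3, which may make the counters easier to follow. This is an illustration under stated assumptions, not a port: the function name scan_line is hypothetical, and the word check is done once when the outermost parenthesis closes instead of at every character inside it.

```python
def scan_line(line, words=("bar", "blat"), crop_spaces=True):
    """Hypothetical sketch of the awk scan: keep only groups containing a word."""
    out = []      # pieces of the output line
    depth = 0     # open/nested parenthesis counter
    start = 0     # index just after the outermost "("
    spaces = 0    # consecutive spaces seen outside parentheses
    for i, ch in enumerate(line):
        if ch == "(":
            depth += 1
            if depth == 1:
                start = i + 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                content = line[start:i]
                # Keep the group only if it contains one of the words.
                if any(word in content for word in words):
                    out.append(content)
                    spaces = 0
        elif depth == 0:
            if crop_spaces and ch == " ":
                if spaces == 0:       # print only the first space of a run
                    out.append(ch)
                spaces += 1
            else:
                spaces = 0
                out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(scan_line("blat (testbar) (barblat) (no) (blatanother)"))
```

Because the depth counter tracks nesting, an outer group is kept whole, inner parentheses included, whenever any of the words appears anywhere inside it.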

Question 6

A bit late, but how about this, the power of simplicity:

> cat test.py
stuff = '''
foo (blah) (bar 80)
foo (cats) (blat 92)
'''

for i in stuff.split('\n'):  # split by \n
  if i:  # skip empty lines
    fields = i.split()  # e.g. ['foo', '(blah)', '(bar', '80)']
    # keep fields 0, 2 and 3 (dropping the first group), then strip the parentheses
    print ' '.join([fields[0], fields[2], fields[3]]).replace('(','').replace(')','')

> python test.py
foo bar 80
foo blat 92

kos · Answer 1 · 23 May 2018 at 20:51