
A Detailed Analysis of dtree

火星人 @ 2014-03-26

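For the original post, see my blog: http://blog.sina.com.cn/baoxiaopan
After several days of heated discussion with my classmate A-Cheng, I finally have a reasonably clear understanding of the script below. It is only two lines long, but it touches on a lot. Learning takes an appetite and an enthusiasm that do not run out, so I have written up my understanding here. Some parts may not be put as clearly as they could be, but I believe I have conveyed what needs to be conveyed. If you find a mistake or have something to add, please email me; I would be very grateful. If anything is unclear, you are also welcome to contact me by email or QQ at any time: xiaopan3322@gmail.com, QQ 157526632.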
Description
dtree is a utility that will display a directory hierarchy or tree.
While Linux comes with hundreds of utilities, something you got used to on another system always seems to be missing. One program in this category is something that will display a directory hierarchy or tree.
While some file managers that run under X-Windows will do this sort of task, it is sometimes very handy to have a command-line version. While not Linux-specific, the dtree utility is such a program.
I will first explain how to use dtree, then explain how it works. If you invoke it by just entering its name it will display the directory hierarchy starting at the current directory. If you invoke it with an argument, that argument is used as the starting directory. For example, if you enter dtree /home/fyl/Cool, a tree of directories under /home/fyl/Cool will be displayed.
dtree is written in the finest old-time Unix tradition using common utilities with a short shell script to glue them together. Here is the program:

Script code:
#!/bin/bash
# print a hierarchy tree starting at
# specified directory (. default)
(cd ${1-.}; pwd)
find ${1-.} -type d -print | sort -f | sed -e "s,^${1-.},," -e "/^$/d" -e 's,[^/]*/\([^/]*\)$,`-----\1,' -e "s,[^/]*/,|      ,g"


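Script analysis:
1. The first statement, (cd ${1-.}; pwd), wraps two commands in parentheses so that they run in a subshell. The benefit is that the cd stays local: the script never ends up in some other path, which for a non-root user avoids some needless permission trouble. A subshell is also useful because directory and environment changes made inside it are not passed back to the parent process, so the parent stays independent and unaffected. The intent of the line is therefore clear: it prints the directory you are about to search, and the subshell lets it do so without side effects.
2. The ${1-.} in the first statement is a choice between two values. The command takes an optional argument, the path the user wants to look at. If the user supplies a path, the expansion yields $1; if no argument is given ($1 is unset), it yields the current directory, that is, the . directory.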
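Points 1 and 2 can be checked with a small sketch. It is not part of the original script: show_start is a hypothetical helper, and / is used as the target directory only because it exists on every system.

```shell
#!/bin/sh
# Sketch: a cd inside ( ... ) does not leak into the caller,
# and ${1-.} falls back to "." when no argument is given.

start=$(pwd)
(cd /; pwd)          # runs in a subshell and prints /
end=$(pwd)           # still the directory we started in

# show_start is a hypothetical helper; inside a function, $1 is the
# function's own first argument, standing in for the script's $1.
show_start() {
    echo "${1-.}"
}

no_arg=$(show_start)
with_arg=$(show_start /home/fyl/Cool)

echo "start=$start end=$end"
echo "no_arg=$no_arg with_arg=$with_arg"
```

The two directory values printed on the first echo line are identical, which is the whole point of the parentheses.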
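3. find ${1-.} -type d -print | sort -f is simple: it finds every directory under the directory the user gave (or the current directory), including that directory itself, and sorts the names from a to z without regard to case.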
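A minimal sketch of this step on a throwaway tree follows. The directory names are made up, and LC_ALL=C is my own addition so that the sort order does not depend on the machine's locale.

```shell
#!/bin/sh
# Sketch: list only the directories of a temporary tree,
# sorted case-insensitively. All names below are invented.
tmp=$(mktemp -d)
mkdir -p "$tmp/demo/Zeta" "$tmp/demo/alpha/inner"
touch "$tmp/demo/file.txt"        # a regular file: -type d skips it

# Without -f, the C locale would put Zeta before alpha.
listing=$(cd "$tmp" && find demo -type d -print | LC_ALL=C sort -f)
echo "$listing"

rm -rf "$tmp"
```

Note that the starting directory itself (demo) is part of the listing; the sed stage later has to get rid of it.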
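4. Analysis of the second pipe (the sed command):
4.1 The first -e deletes the directory argument (or the default .) from the start of every line, which turns the line for the starting directory itself into an empty line. Taking example I, the result up to and including the first -e is: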
tdlteman@hzling06:~$ sh dtree.sh bak_config
/home/tdlteman/bak_config

/bak_script
/bak_script/dos2unix-3.1
/bak_script/dos2unix-3.1/dos2unix-3.1
/bak_script/L2_xp
/bak_script/test
/configFiles
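As you can see, the second line is empty.
4.2 The second -e deletes empty lines, which removes the empty line produced by the previous step. Taking example I, the result after the second -e is: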
/home/tdlteman/bak_config
/bak_script
/bak_script/dos2unix-3.1
/bak_script/dos2unix-3.1/dos2unix-3.1
/bak_script/L2_xp
/bak_script/test
/configFiles
As you can see, the empty second line has been deleted.
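The first two expressions can be tried in isolation. In this sketch the find output is replaced by a few hard-coded lines from example I, and the variable start stands in for ${1-.}; both are assumptions made only for illustration.

```shell
#!/bin/sh
# Sketch: run only the first two sed expressions on sample input.
start=bak_config      # stands in for ${1-.}
step2=$(printf '%s\n' \
    bak_config \
    bak_config/bak_script \
    bak_config/bak_script/dos2unix-3.1 \
    bak_config/configFiles |
    sed -e "s,^${start},," -e '/^$/d')
echo "$step2"
```

The first input line becomes empty after the prefix is stripped and is then dropped by /^$/d, so three lines come out.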
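4.3 The key to the third -e is the use of $ and \(..\). In sed, $ anchors a match to the end of the line; for example, /sed$/ matches every line that ends in sed. \(..\) saves the text it matches; for example, s/\(love\)able/\1rs/ turns loveable into lovers. So [^/]*/\([^/]*\)$ matches the last two path components of a line: a run of non-slash characters (possibly empty), a slash, and then a final run of non-slash characters that reaches the end of the line. That final run, the last component, is saved as \1 for use in the replacement; whether the line starts with / does not matter. Taking example I, the first matching line is /bak_script. There the first [^/]* matches nothing, the / matches the leading slash, and \1 is bak_script, so the whole match /bak_script is replaced by `-----bak_script. On /bak_script/dos2unix-3.1 the match is bak_script/dos2unix-3.1, which leaves /`-----dos2unix-3.1, and so on. Because the substitution has no g flag, each line is changed only once and the rest of the line is left alone. Taking example I, the result up to and including the third -e is: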
/home/tdlteman/bak_config
`-----bak_script
/`-----dos2unix-3.1
/bak_script/`-----dos2unix-3.1
/`-----L2_xp
/`-----test
`-----configFiles
As you can see, the hierarchy has basically taken shape.
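The third expression on its own can be sketched like this, feeding it a few lines in the form they have after step 4.2. The sample lines are taken from example I.

```shell
#!/bin/sh
# Sketch: apply only the third sed expression.
# \1 holds the last path component; the component before it
# and the slash between them are consumed by the match.
step3=$(printf '%s\n' \
    /bak_script \
    /bak_script/dos2unix-3.1 \
    /bak_script/dos2unix-3.1/dos2unix-3.1 |
    sed -e 's,[^/]*/\([^/]*\)$,`-----\1,')
echo "$step3"
```

Each output line keeps exactly one slash per level of nesting below the first, which is what the fourth expression later turns into the vertical bars.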
If you are curious, try removing the first [^/]*/, so that the expression becomes "s,\([^/]*\)$,\`-----\1,". Here are my test results after the third -e:
/home/tdlteman/bak_config
/`-----bak_script
/bak_script/`-----dos2unix-3.1
/bak_script/dos2unix-3.1/`-----dos2unix-3.1
/bak_script/`-----L2_xp
/bak_script/`-----test
/`-----configFiles
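so the final result becomes:
/home/tdlteman/bak_config
|      `-----bak_script
|      |      `-----dos2unix-3.1
|      |      |      `-----dos2unix-3.1
|      |      `-----L2_xp
|      |      `-----test
|      `-----configFiles
The effect is obvious: from the second line on, every line keeps one extra /, so the final result gains one extra outermost | column. A brief analysis: once [^/]*/ is removed, the third expression no longer consumes any slash, and that slash is left for the fourth -e to match. The output of the modified third step therefore always has one more / per line than the original, and hence one more | substitution. In other words, the [^/]*/ in the third expression makes sure that every line it rewrites loses one more / (together with the characters before it).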
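The experiment can be reproduced with a short sketch. It runs the modified third expression followed by the fourth on two sample lines; the six spaces after the pipe follow the author's description of the script.

```shell
#!/bin/sh
# Sketch: third expression WITHOUT the leading [^/]*/,
# followed by the unchanged fourth expression.
variant=$(printf '%s\n' /bak_script /bak_script/L2_xp |
    sed -e 's,\([^/]*\)$,`-----\1,' -e 's,[^/]*/,|      ,g')
echo "$variant"
```

Even the top-level entry now carries a leading bar, because its leading slash survived the third step.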
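4.4 The last -e is similar to the third. It replaces every string that contains no / but is followed by a / with a | and padding spaces; in practice that means every remaining component ending in /. A bare leading / is replaced as well, because [^/]* may match nothing (this can be seen as a special case). After all the -e expressions have run, the final result is:
/home/tdlteman/bak_config
`-----bak_script
|      `-----dos2unix-3.1
|      |      `-----dos2unix-3.1
|      `-----L2_xp
|      `-----test
`-----configFiles

As you can see, the hierarchy is complete and the levels are easy to read.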
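The fourth expression can also be run by itself, on lines in the form they have after step 4.3. This is a sketch on sample lines from example I, with six spaces after the pipe as the author describes.

```shell
#!/bin/sh
# Sketch: apply only the fourth sed expression.
# With g, every "component/" (including a bare leading "/")
# becomes one "|      " column.
step4=$(printf '%s\n' \
    '`-----bak_script' \
    '/`-----dos2unix-3.1' \
    '/bak_script/`-----dos2unix-3.1' |
    sed -e 's,[^/]*/,|      ,g')
echo "$step4"
```

The number of bars on a line equals the number of slashes that were left after the third step.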
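5. In 's,[^/]*/\([^/]*\)$,`-----\1,' the hard (single) quotes can also be changed to soft (double) quotes. With soft quotes the ` has to be escaped with \, so the expression becomes "s,[^/]*/\([^/]*\)$,\`-----\1,"
Note that "sh dtree.sh bak_config" and "sh dtree.sh bak_config/" give slightly different results. The reason is that the string the first sed expression matches has changed: with the trailing slash, the prefix bak_config/ is removed, so the remaining lines no longer start with /. I will not analyse it in detail here; the process is exactly the same, and you can trace it yourself if you are interested. Only the results are given below.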
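One way to make both invocations behave the same is to strip trailing slashes from the argument before it is used. The sketch below is not in the original script; strip_trailing_slash is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): remove
# trailing slashes so that "bak_config/" and "bak_config" give
# the same prefix to the first sed expression.
strip_trailing_slash() {
    dir=${1-.}
    # Drop one trailing slash at a time, but keep a lone "/" intact.
    while [ "$dir" != "/" ] && [ "${dir%/}" != "$dir" ]; do
        dir=${dir%/}
    done
    printf '%s\n' "$dir"
}

strip_trailing_slash bak_config/
strip_trailing_slash bak_config
```

A modified dtree could call this once and use the result in cd, find and sed instead of ${1-.}.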


Examples:
I:
##### Result of running sh dtree.sh bak_config:
### Output with only the first pipe (without the sed command):
tdlteman@hzling06:~$ sh dtree.sh bak_config
/home/tdlteman/bak_config
bak_config
bak_config/bak_script
bak_config/bak_script/dos2unix-3.1
bak_config/bak_script/dos2unix-3.1/dos2unix-3.1
bak_config/bak_script/L2_xp
bak_config/bak_script/test
bak_config/configFiles
### Output of the whole script:
tdlteman@hzling06:~$ sh dtree.sh bak_config
/home/tdlteman/bak_config
`-----bak_script
|      `-----dos2unix-3.1
|      |      `-----dos2unix-3.1
|      `-----L2_xp
|      `-----test
`-----configFiles

II:
##### Result of running sh dtree.sh bak_config/:
### Output with only the first pipe (without the sed command):
tdlteman@hzling06:~$ sh dtree.sh bak_config/
/home/tdlteman/bak_config
bak_config/
bak_config/bak_script
bak_config/bak_script/dos2unix-3.1
bak_config/bak_script/dos2unix-3.1/dos2unix-3.1
bak_config/bak_script/L2_xp
bak_config/bak_script/test
bak_config/configFiles
### Output of the whole script:
tdlteman@hzling06:~$ sh dtree.sh bak_config/
/home/tdlteman/bak_config
bak_script
`-----dos2unix-3.1
|      `-----dos2unix-3.1
`-----L2_xp
`-----test
configFiles


Finally, here is the script author's own explanation, for anyone who wants to refer to it:
The first line in the output is the name of the directory dtree was run on. This line was produced by the line that begins with (cd. Breaking this line down:
* ${1-.} means use the first argument from the command line ($1) if it is available, otherwise use . which is a synonym for the current directory. Thus, the cd command either changes to the directory specified on the line that invoked dtree or to the current directory (a virtual no-op).
* pwd then displays the path name of the current directory.
* The parentheses around the whole line force the command to be run in a subshell. This means the cd command is local to this line and subsequent commands will be executed from what was the current directory when dtree was initially invoked.
* The find command prints out all files whose type is d (for directory). The same directory reference is used as in cd.
* The output of find is piped into sort and the -f option tells sort to fold upper and lower case names together.
* The tricky formatting of the tree is done by sed in four steps. Each step is set off by -e. This is how you tell sed a program follows.
* The first expression, "s,^${1-.},,", is a substitute command which tells sed to replace everything between the first two delimiters (a comma is used as the delimiter) with everything between the second. The initial ^ causes the match to be performed only at the beginning of the line. The expression that follows is, again, the starting directory reference, and the string between the second pair of delimiters is null. Thus, the requested directory name from the beginning of the output of sort is trimmed.
* The second expression, /^$/d, tells sed to delete all blank lines (lines with nothing between the beginning and the end).
* The third expression is probably the trickiest. It uses the ability to remember a string within a regular expression and then use it later. The expression s,[^/]*/\([^/]*\)$,\`-----\1, tells sed to replace the last two strings separated by a slash (/) with a backquote, five dashes and the last string (following the final slash).
* Lastly, the final expression, -e "s,[^/]*/,|      ,g", tells sed to replace every occurrence of strings that do not contain a slash but are followed by a slash, with a pipe (|) and six spaces.

Unless you are familiar with regular expressions you probably didn't follow all that. But you probably learned something and you can easily use dtree without having to understand how it works.
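That is about it. The script itself is fixed, but you can modify it to suit your own purposes, and you will find plenty of interesting things: changing even a single character changes the result. That is the charm of shell scripting. Finally, thanks again to A-Cheng.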
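As one example of such a modification, here is a sketch that wraps the same pipeline in a shell function and runs it on a throwaway tree. The quoting, the LC_ALL=C on sort, and the test directory names are my own assumptions, not part of the original script. Like the original, it still uses the directory name as a sed pattern, so names containing commas or regular-expression characters would break it.

```shell
#!/bin/sh
# Sketch: the dtree pipeline as a function (same four sed steps).
dtree() {
    (cd "${1-.}" && pwd)
    find "${1-.}" -type d -print | LC_ALL=C sort -f |
        sed -e "s,^${1-.},," -e '/^$/d' \
            -e 's,[^/]*/\([^/]*\)$,`-----\1,' \
            -e 's,[^/]*/,|      ,g'
}

# Build a small temporary tree with made-up names and draw it.
tmp=$(mktemp -d)
mkdir -p "$tmp/top/a/b" "$tmp/top/c"
# sed 1d drops the pwd line, which differs from machine to machine.
tree=$(cd "$tmp" && dtree top | sed 1d)
echo "$tree"
rm -rf "$tmp"
```

The function form makes it easy to experiment: change one character in any of the four expressions and rerun it on the same temporary tree to see the effect.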


Finally, I am attaching a valuable classic:
Advanced Bash-Scripting Guide_6.1.pdf
