Lines of code and word counts with xargs and wc
The other day I finished a new project and was curious how many lines of code I'd written. I wanted to find each file recursively within a parent directory, count the lines in each file, then get a total once all files were found and tallied. After a bit of searching, I found that piping a list of files to xargs and using wc to count lines or words was a good way to quickly get an approximate count.
How does xargs work? From the manual page (man xargs):
xargs reads items from the standard input, delimited by blanks (which can be protected with double or single quotes or a backslash) or newlines, and executes the command (default is /bin/echo) one or more times with any initial-arguments followed by items read from standard input. Blank lines on the standard input are ignored.
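To see that splitting in action, here is a small sketch that feeds three items to xargs, separated by a blank and a newline. The -n 1 option asks xargs to run the command once per item; the "item:" label is only there for illustration.

```shell
# Three items arrive on standard input: "one" and "two" are separated
# by a blank, "three" by a newline. xargs treats all of them as items.
items=$(printf 'one two\nthree\n' | xargs -n 1 echo item:)

# Each item was passed to its own echo invocation.
printf '%s\n' "$items"
```

Without -n 1, xargs would pass all three items to a single echo call.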
One possible pitfall of xargs, from the manual, is that blanks or newlines in filenames won't be processed correctly. Luckily, I rarely save filenames with blanks or newlines, but this is good to know.
Because Unix filenames can contain blanks and newlines, this default behaviour is often problematic; filenames containing blanks and/or newlines are incorrectly processed by xargs. In these situations it is better to use the -0 option, which prevents such problems. When using this option you will need to ensure that the program which produces the input for xargs also uses a null character as a separator. If that program is GNU find for example, the -print0 option does this for you.
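As a minimal sketch of the advice quoted above, the following pairs find's -print0 with xargs -0. The scratch directory and the file names in it are made up for the example.

```shell
# Hypothetical scratch directory with one filename that contains a space.
demo_dir=$(mktemp -d)
printf 'one\ntwo\n' > "$demo_dir/plain.js"
printf 'one\ntwo\nthree\n' > "$demo_dir/with space.js"

# -print0 separates filenames with a null character and -0 tells xargs
# to split on it, so "with space.js" is passed to wc as a single argument.
safe_output=$(find "$demo_dir" -name '*.js' -print0 | xargs -0 wc -l)
printf '%s\n' "$safe_output"

rm -r "$demo_dir"
```

With a plain pipe and no -0, xargs would split "with space.js" at the blank and wc would complain about files that don't exist.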
And how does wc work? Again, from the manual page (man wc):
Print newline, word, and byte counts for each FILE, and a total line if more than one FILE is specified. A word is a non-zero-length sequence of characters delimited by white space.
The options below may be used to select which counts are printed, always in the following order: newline, word, character, byte, maximum line length.
-c, --bytes print the byte counts
-m, --chars print the character counts
-l, --lines print the newline counts
-w, --words print the word counts
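Here is a quick sketch of three of those options on a throwaway file. The file contents are invented for the example, and tr strips the padding that some versions of wc put in front of the number.

```shell
# Hypothetical sample file: two lines, five words.
sample_file=$(mktemp)
printf 'hello world\nsecond line here\n' > "$sample_file"

# Reading from standard input keeps the filename out of wc's output.
line_count=$(wc -l < "$sample_file" | tr -d ' ')
word_count=$(wc -w < "$sample_file" | tr -d ' ')
byte_count=$(wc -c < "$sample_file" | tr -d ' ')

echo "$line_count lines, $word_count words, $byte_count bytes"

rm "$sample_file"
```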
Counting lines
To count lines of code (or, more accurately, newlines in files within a directory), I first run find to list the files I want to count. In this particular case, I wanted files with either a .js or a .jsx extension. Then I pipe those filenames into xargs, which runs wc -l to count the lines in each file.
find [directory] -name '*.extension' | xargs wc -l
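When two specific extensions are wanted, such as .js and .jsx, one option is to combine two -name tests with find's -o (or) operator. A looser wildcard is shorter to type, but it can also pick up extras such as .json files. The sketch below uses a made-up scratch directory to show the difference.

```shell
# Hypothetical scratch directory, so the example is self-contained.
src_dir=$(mktemp -d)
printf 'a\n' > "$src_dir/page.js"
printf 'a\nb\n' > "$src_dir/widget.jsx"
printf '{}\n' > "$src_dir/data.json"

# The escaped parentheses group the two -name tests joined by -o.
# -print0 and -0 follow the manual's advice on unusual filenames.
exact_output=$(find "$src_dir" \( -name '*.js' -o -name '*.jsx' \) -print0 | xargs -0 wc -l)
printf '%s\n' "$exact_output"

rm -r "$src_dir"
```

Here data.json is skipped, so only the two source files are counted.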
As an example, I'll list out the line counts for the components, templates, and pages that comprise this blog:
find ~/gatsby-blog/src/ -name '*.js*' | xargs wc -l
92 /home/maryknize/gatsby-blog/src/pages/photos.js
16 /home/maryknize/gatsby-blog/src/pages/404.js
176 /home/maryknize/gatsby-blog/src/pages/art.js
72 /home/maryknize/gatsby-blog/src/pages/projects.js
113 /home/maryknize/gatsby-blog/src/pages/blog.js
286 /home/maryknize/gatsby-blog/src/pages/index.js
44 /home/maryknize/gatsby-blog/src/pages/microblog.js
17 /home/maryknize/gatsby-blog/src/pages/about.js
106 /home/maryknize/gatsby-blog/src/components/header.js
54 /home/maryknize/gatsby-blog/src/components/layout.js
101 /home/maryknize/gatsby-blog/src/components/seo.js
52 /home/maryknize/gatsby-blog/src/components/duolingo.js
31 /home/maryknize/gatsby-blog/src/components/microblog.js
104 /home/maryknize/gatsby-blog/src/templates/blog-post.js
50 /home/maryknize/gatsby-blog/src/templates/image-page.js
81 /home/maryknize/gatsby-blog/src/templates/photo-page.js
1395 total
I really like that every file is listed with its own line count and that there's a total once all files have been counted. If one file had a much larger line count than the others, I would consider splitting it into smaller components. In this case, I see that my index.js file is much longer than the rest of the files listed, and the art page is also on the large side. I actually know that there's a component I've created on both pages that I should abstract out to its own file. I also know that index.js has a large amount of commented-out lines from things I've tried in the past. Once I finish those two tasks, both files should be much smaller.
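To make the largest files easier to spot, the output can be piped through sort -n, which orders the lines by their leading number. This is a sketch on a made-up scratch directory, not the blog's real source tree.

```shell
# Hypothetical scratch directory with files of different lengths.
sort_dir=$(mktemp -d)
printf 'a\nb\nc\n' > "$sort_dir/index.js"
printf 'a\n' > "$sort_dir/about.js"
printf 'a\nb\n' > "$sort_dir/blog.js"

# sort -n compares the leading line counts numerically, so the
# longest file ends up just above the total on the last line.
sorted_output=$(find "$sort_dir" -name '*.js' -print0 | xargs -0 wc -l | sort -n)
printf '%s\n' "$sorted_output"

rm -r "$sort_dir"
```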
Counting words
For fun, I thought I'd also count the words I've written so far on this blog. To do that, I replace the -l flag with -w.
find [directory] -name '*.extension' | xargs wc -w
find .. -name '*.md' | xargs wc -w
1208 ../books/book_review_the_good_neighbor_fred_rogers_king.md
836 ../books/book_review_the_advantage_lencioni.md
178 ../programming/spotify_mini_from_the_command_line.md
1171 ../programming/a_static_blog_generator_from_scratch.md
808 ../programming/creating_a_fitbit_watch_face.md
220 ../programming/updating_python_on_a_raspberry_pi.md
78 ../programming/lines_of_code_and_word_count_with_xargs.md
510 ../programming/lessons_learned_from_a_large_project.md
307 ../programming/optional_chaining_and_nullish_coalescing_in_javascript.md
965 ../programming/useful-git-commands.md
316 ../programming/upcoming_project_hacking_fans_and_beds.md
1655 ../programming/creating_a_rot13_encoder_decoder_in_rust.md
416 ../programming/intentionally_useless_websites.md
1503 ../programming/creating_a_duolingo_widget.md
1266 ../programming/new_project_thinkery_markdown_microblog_laravel.md
712 ../programming/ubuntu_disco_dingo_driver_drama.md
249 ../programming/custom_domains_with_github_pages_and_namecheap.md
396 ../programming/domain_redirects_with_netlify_and_gatsby.md
2703 ../programming/server_side_rendering_and_api_calls_in_rust.md
928 ../programming/gatsby-image-annotations-using-exif-data.md
1162 ../recipes/recipe_half_batch_macarons.md
1444 ../recipes/recipe_marys_famous_vegetarian_biscuits_and_gravy.md
722 ../recipes/recipe_easy_one_pot_french_style_scrambled_eggs.md
892 ../recipes/recipe_lazy_sweet_potato_squash_bean_enchiladas.md
449 ../recipes/recipe_lentil_mushroom_coconut_curry_risotto.md
482 ../recipes/recipe_fishless_tacos_with_mango_salsa.md
1106 ../home_automation/running_home_assistant_from_an_external_hard_drive.md
1297 ../home_automation/quick_and_dirty_fan_hacking_with_raspberry_pi.md
316 ../life/so_much_to_update.md
1230 ../life/spreadsheets_for_everything.md
434 ../life/air_conditioning_float_switches_for_dumb_homeowners.md
661 ../life/programming_in_portrait_with_a_monitor_stand.md
490 ../life/learning_languages_for_fun.md
341 ../art/creating_an_animated_gif_on_the_command_line.md
481 ../art/glitch_art_with_sox_imagemagick_and_vim.md
637 ../art/the_allure_of_the_alternate_reality_game.md
1304 ../advent_of_code/advent_of_code_2020_day_7.md
557 ../advent_of_code/advent_of_code_2020_day_6.md
913 ../advent_of_code/advent_of_code_2020_day_9.md
894 ../advent_of_code/advent_of_code_2020_day_8.md
876 ../advent_of_code/advent_of_code_2020_day_3.md
1508 ../advent_of_code/advent_of_code_2020_day_4.md
1080 ../advent_of_code/advent_of_code_2020_day_1.md
947 ../advent_of_code/advent_of_code_2020_day_2.md
838 ../advent_of_code/advent_of_code_2020_day_5.md
495 ../web_accessibility/wcag_guideline_1_3_adaptable.md
728 ../web_accessibility/wcag_guideline_1_1_text_alternatives.md
459 ../writing/transitioning_from_a_blog_to_a_digital_garden.md
733 ../writing/i_write_every_day_but_you_dont_see_it.md
39901 total
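As the manual says, xargs may execute the command more than once. With a very long list of files, that would mean several runs of wc and more than one total line. If only the grand total matters, one way around this is to have xargs run cat and count the combined text once. The sketch below assumes a made-up scratch directory with two small files.

```shell
# Hypothetical scratch directory with two small Markdown files.
words_dir=$(mktemp -d)
printf 'one two three\n' > "$words_dir/first.md"
printf 'four five\n' > "$words_dir/second.md"

# xargs concatenates every file with cat; a single wc -w then counts
# the combined stream, so there is exactly one number in the output.
# tr strips the padding some versions of wc add.
total_words=$(find "$words_dir" -name '*.md' -print0 | xargs -0 cat | wc -w | tr -d ' ')
echo "$total_words"

rm -r "$words_dir"
```

The trade-off is that the per-file counts are lost, so this only fits when the total is all that's needed.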