I wrote previously about how we test the internationalization of our website in "Testing internationalization language files." In short, we generate a blank language file in which the value of every label is empty, switch the site to that language, and then spider the site looking for any text that remains.
Over the past couple of months, we have improved our internationalization test and removed some of the existing limitations.
Manually marking nonlocalizable content
One limitation of the approach detailed in the previous article was that we had to manually mark content on the page that should not be internationalized by adding a class to the HTML:
<span class="nonlocalizable"><%= @building.address %></span>
The basis of our new test is the idea that all text on the page is one of two types:
- Labels and static text that live in the language files, which are inserted into the page using the GLoc method l()
- Text that the application produces, which should be HTML-escaped using the h() method in the views or helpers
Therefore, if we intercept both of these types of text, we can find anything that is not localized or escaped.
Our new test setup looks like:
def setup
  blank_out_localization
  blank_out_html_escape
end

def blank_out_localization
  GLoc::InstanceMethods.class_eval do
    alias :old_l :l
    def l(symbol, *arguments)
      ""
    end
  end
end

def blank_out_html_escape
  ERB::Util.class_eval do
    alias :old_html_escape :html_escape
    def html_escape(s)
      ""
    end
    alias :h :html_escape
  end
end
We redefine the l() method to return an empty string, so anything that is localized will no longer show up on the page.
The h() and html_escape() methods are used to escape strings for the web (for example, converting ‘<’ into ‘&lt;’). We also redefine these methods to return empty strings. Now, all text on the webpage should be blanked out.
We then spider the site as before, which walks every page and checks for non-blank text.
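To make the idea concrete, here is a minimal, self-contained sketch that runs outside Rails. FakeView, BlankedHelpers, leftover_text and TEMPLATE are hypothetical names invented for this illustration, and the l() method only imitates a GLoc lookup. Once both helpers return empty strings, the only text left in the rendered page is the text that bypassed them:

```ruby
require 'erb'

# Hypothetical stand-in for a Rails view; `l` imitates a GLoc label lookup.
class FakeView
  include ERB::Util # provides h() / html_escape()

  LABELS = { :name_label => "Name" }

  def initialize(address)
    @address = address
  end

  def l(symbol, *arguments)
    LABELS.fetch(symbol)
  end

  def render(template)
    ERB.new(template).result(binding)
  end
end

# Blanked versions of both helpers, mirroring the test setup.
module BlankedHelpers
  def l(symbol, *arguments)
    ""
  end

  def h(text)
    ""
  end
end

# Strip tags, then collect any runs of letters and spaces that survived.
def leftover_text(html)
  html.gsub(/<[^>]+>/, "\n").scan(/[A-Za-z](?:[A-Za-z]| )*/).map { |s| s.strip }
end

# The second paragraph is hard-coded and unescaped on purpose.
TEMPLATE = '<p><%= l(:name_label) %>: <%= h(@address) %></p>' \
           '<p>Hello <%= @address %></p>'

view = FakeView.new("Main Street")
view.extend(BlankedHelpers) # patch only this one object
puts leftover_text(view.render(TEMPLATE)).inspect # => ["Hello Main Street"]
```

The sketch patches a single object with extend rather than reopening a shared module, which keeps the example from affecting anything else; the real test has to patch the shared modules because it cannot reach each view instance.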
It is possible to restore the l() and h() methods in the teardown:
def teardown
  restore_html_escape
  restore_localization
end

def restore_html_escape
  ERB::Util.class_eval do
    alias :html_escape :old_html_escape
    # h was aliased to the blanked method in setup, so it must be re-pointed as well.
    alias :h :html_escape
  end
end

def restore_localization
  GLoc::InstanceMethods.class_eval do
    alias :l :old_l
  end
end
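The save-and-restore pattern can be tried in isolation. The following is only a sketch on a toy module: Labels, Page and with_blank_labels are hypothetical names, not part of GLoc or of our test. It wraps the patch in a block so that an ensure clause restores the original method even when the block raises:

```ruby
# Hypothetical module standing in for GLoc::InstanceMethods.
module Labels
  def l(symbol)
    symbol.to_s.capitalize
  end
end

class Page
  include Labels
end

def blank_out_labels
  Labels.class_eval do
    alias_method :old_l, :l            # keep a handle on the real method
    define_method(:l) { |symbol| "" }  # blank replacement
  end
end

def restore_labels
  Labels.class_eval do
    alias_method :l, :old_l  # point l back at the saved original
    remove_method :old_l     # leave nothing behind for later tests
  end
end

# Block form: `ensure` restores the method even if the block raises.
def with_blank_labels
  blank_out_labels
  yield
ensure
  restore_labels
end

page = Page.new
puts with_blank_labels { page.l(:name) }.inspect # => ""
puts page.l(:name)                               # => Name
```

Note that an alias captures the method body at the moment it is created, so any second name that was aliased to a patched method has to be re-aliased during the restore too.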
However, I think it is safer to run this test in its own test suite in a separate Ruby process. That way, the l() and h() monkey patching cannot accidentally affect other tests:
namespace :test do
  Rake::TestTask.new(:'internationalization' => ["environment", "load_test_data"]) do |t|
    t.libs << "test"
    t.pattern = "test/acceptance/internationalization_test.rb"
    t.verbose = true
  end

  Rake::TestTask.new(:'acceptance' => ["environment", "load_test_data"]) do |t|
    t.libs << "test"
    t.pattern = FileList["test/acceptance/**/*_test.rb"].exclude("test/acceptance/internationalization_test.rb")
    t.verbose = true
  end
end
Now, we no longer need to mark any content as nonlocalizable. If the test fails, we either forgot to add a label to the language file, or we forgot to escape the text in the page:
<%= l(:name_label) %> or <%= h(@building.address) %>
Redirects
We noticed that Rails would send redirects as:
<html><body>You are being <a href="http://www.example.com/some/new/location">redirected</a>.</body></html>
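The absolute http://www.example.com prefix was tripping up SpiderTest, so we strip it from every page. Furthermore, we skip our page checking on redirect pages and assets:
def consume_page(html, url)
  # Strip the test host so that SpiderTest sees site-relative links.
  html.gsub!("http://www.example.com", "")
  unless redirect?(html) || asset?(url)
    assert_page_has_been_moved_to_language_file(html, url)
  end
  # Always hand the page on to the spider so that its links are still followed.
  super
end

def redirect?(html)
  html.include?("<body>You are being")
end

def asset?(url)
  # File.file? rather than File.exist?, so that a URL such as "/" (which maps
  # to the public directory itself) is not mistaken for an asset.
  File.file?(File.expand_path("#{RAILS_ROOT}/public/#{url}"))
end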
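The two predicates are easy to exercise outside Rails. This is a sketch under stated assumptions: redirect_page? and asset_url? are hypothetical standalone versions that take the public directory as an explicit parameter instead of reading RAILS_ROOT, and the file names are made up. It also shows why the choice between File.exist? and File.file? matters for a URL like "/", since File.exist? is true for directories as well:

```ruby
require 'tmpdir'
require 'fileutils'

REDIRECT_MARKER = "<body>You are being"

# True for the stock redirect body quoted above.
def redirect_page?(html)
  html.include?(REDIRECT_MARKER)
end

# public_root replaces "#{RAILS_ROOT}/public" so the sketch runs anywhere.
def asset_url?(public_root, url)
  File.file?(File.join(public_root, url))
end

Dir.mktmpdir do |public_root|
  FileUtils.mkdir_p(File.join(public_root, "images"))
  File.write(File.join(public_root, "images", "logo.png"), "")

  puts asset_url?(public_root, "/images/logo.png") # => true
  puts asset_url?(public_root, "/buildings/1")     # => false
  puts asset_url?(public_root, "/")                # => false
  puts File.exist?(File.join(public_root, "/"))    # => true (a directory exists there)
end
```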
Alt and title attributes
We discovered with the original test that we were not testing alt and title attributes on the page. For example, if you hover over a link, it will show the title. We also want these strings internationalized, so we added them to the test with the following code:
assert_attribute_does_not_contain_words body, url, 'title'
assert_attribute_does_not_contain_words body, url, 'alt'

def assert_attribute_does_not_contain_words body, url, attribute
  body.search("//*[@#{attribute}]") do |element|
    assert_does_not_contain_words element.get_attribute(attribute), url
  end
end
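For readers who want to try the attribute check without installing Hpricot, here is a rough sketch that swaps the XPath search for a regular expression. attribute_values and untranslated_attributes are hypothetical helpers, and the regex only handles double-quoted values, so treat it as an illustration rather than a replacement for a real HTML parser:

```ruby
# Collect the values of one attribute from every tag in the HTML.
# Regex-based and limited to double-quoted attribute values.
def attribute_values(html, attribute)
  html.scan(/<[^>]*?\s#{Regexp.escape(attribute)}="([^"]*)"/).flatten
end

# Values that still contain letters were neither localized nor escaped
# (both helpers return "" while the test runs).
def untranslated_attributes(html, attribute)
  attribute_values(html, attribute).select { |value| value =~ /[A-Za-z]/ }
end

html = '<a href="/" title="Go home"><img src="logo.png" alt=""></a>' \
       '<img src="x.png" alt="Company logo">'

puts untranslated_attributes(html, "title").inspect # => ["Go home"]
puts untranslated_attributes(html, "alt").inspect   # => ["Company logo"]
```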
Better error messages
We noticed that if you accidentally forget to internationalize a string like “Please enter your username,” the test would fail with a message of “Found text that was not in the language file: Please.” We thought it would be better to show the full string, so we replaced the regex:
/\w+/
with
/[A-Za-z]([A-Za-z]| )*/
The second one matches a letter followed by any run of letters and spaces, so it picks up the entire phrase.
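The difference between the two patterns is easy to check in isolation. In this small sketch, OLD_PATTERN, NEW_PATTERN and first_offender are names made up for the example. One side effect worth knowing: \w also matches digits and underscores, so the old pattern flags a bare number while the new one skips it.

```ruby
OLD_PATTERN = /\w+/
NEW_PATTERN = /[A-Za-z]([A-Za-z]| )*/

# Returns the first stretch of text the pattern would report, or nil.
def first_offender(text, pattern)
  text[pattern]
end

text = "Please enter your username,"
puts first_offender(text, OLD_PATTERN).inspect      # => "Please"
puts first_offender(text, NEW_PATTERN).inspect      # => "Please enter your username"
puts first_offender("3 rooms", OLD_PATTERN).inspect # => "3"
puts first_offender("3 rooms", NEW_PATTERN).inspect # => "rooms"
puts first_offender(" : ", NEW_PATTERN).inspect     # => nil
```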
Final result
The final test looks like:
require 'hpricot'

class InternationalizationTest < ActionController::IntegrationTest
  include Caboose::SpiderIntegrator

  def setup
    blank_out_localization
    blank_out_html_escape
  end

  # Every label looked up through GLoc renders as an empty string.
  def blank_out_localization
    GLoc::InstanceMethods.class_eval do
      alias :old_l :l
      def l(symbol, *arguments)
        ""
      end
    end
  end

  # Every string escaped with h() or html_escape() renders as an empty string.
  def blank_out_html_escape
    ERB::Util.class_eval do
      alias :old_html_escape :html_escape
      def html_escape(s)
        ""
      end
      alias :h :html_escape
    end
  end

  def test_all_text_has_been_moved_to_language_file
    get '/'
    assert_response :success
    spider(@response.body, '/', :verbose => true)
  end

  def consume_page(html, url)
    # Strip the test host so that SpiderTest sees site-relative links.
    html.gsub!("http://www.example.com", "")
    unless redirect?(html) || asset?(url)
      assert_page_has_been_moved_to_language_file(html, url)
    end
    # Always hand the page on to the spider so that its links are still followed.
    super
  end

  def redirect?(html)
    html.include?("<body>You are being")
  end

  def asset?(url)
    # File.file? rather than File.exist?, so that a URL such as "/" (which maps
    # to the public directory itself) is not mistaken for an asset.
    File.file?(File.expand_path("#{RAILS_ROOT}/public/#{url}"))
  end

  def assert_page_has_been_moved_to_language_file(page_text, url)
    doc = Hpricot.parse(page_text)
    assert_does_not_contain_words doc.at("title").inner_text, url
    body = doc.at('body')
    # Inline JavaScript is not user-visible text, so drop it before checking.
    (body.search("//script[@type='text/javascript']")).remove
    assert_does_not_contain_words(body.inner_text, url)
    assert_attribute_does_not_contain_words body, url, 'title'
    assert_attribute_does_not_contain_words body, url, 'alt'
  end

  def assert_attribute_does_not_contain_words body, url, attribute
    body.search("//*[@#{attribute}]") do |element|
      assert_does_not_contain_words element.get_attribute(attribute), url
    end
  end

  def assert_does_not_contain_words text, url
    # A letter followed by letters and spaces, so the whole phrase is reported.
    match = text.match(/[A-Za-z]([A-Za-z]| )*/)
    fail "Found text that was not in the language file: #{match[0].inspect} on #{url}" if match
  end
end
These modifications have improved the quality of the internationalization test, and this test has been very useful at catching text that we forget to internationalize.