This story started about six months ago. That's when I wrote my first meaningful module for Rails, Sourcebuster. Around the same time, I got an invite to Habr for a post about this module. Most of the theory is already laid out at the link, and I don't want to repeat the old post. Instead, I suggest you read the previous post before reading this one.
- The script determines the parameters of visitors' traffic sources and stores the data in cookies.
- The logic for determining and overwriting sources is exactly the same as in Google Analytics.
- The script is completely autonomous in terms of data retrieval and has no third-party dependencies (such as _utmz cookies).
- You can use the data you get:
  - to swap phone numbers
  - to swap content on the website (e.g., headings)
  - to save it along with forms submitted from the site
  - to export it to your CRM or analytics systems.
Installation and adjustment
The closer the script is to the top of the page, the earlier you get a cookie whose data can be used to manipulate DOM objects.
Inserting into pages:
Suitable for those who:
- do not use subdomains on the site
- count only visits from Yandex and Google as organic traffic (the rest is unimportant / error / insert your variant)
What you get out of the box
- By default, only visits from Yandex and Google are counted as organic traffic.
- Duration of a user session: 30 minutes.
- By default, the script assumes that the site does not use subdomains. See below for a detailed explanation of configuring the script to work with subdomains.
- By default, the user's IP address is not saved.
Installation "for advanced PC users"
<script>
  // Create the settings array if it does not exist yet
  var _sbjs = _sbjs || [];
  _sbjs.push(['_setSessionLength', 15]);
  _sbjs.push(['_setBaseHost', 'statica.alexfedoseev.com']);
  _sbjs.push(['_setTimeZoneOffset', 4]);
  _sbjs.push(['_addOrganicSource', 'yahoo.com', 'p']);
  _sbjs.push(['_addOrganicSource', 'bing.com', 'q', 'bing']);
  _sbjs.push(['_addReferralSource', 'facebook.com', 'social']);
  _sbjs.push(['_addReferralSource', 't.co', 'social', 'twitter.com']);
  _sbjs.push(['_addReferralSource', 'plus.url.google.com', 'social', 'plus.google.com']);
</script>
<script src="/path/to/sourcebuster.min.js" id="sbjs"></script>
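Push your settings into the settings array with _sbjs.push, then load the main script right after that.
There are 7 types of user settings in total: _setSessionLength, _setBaseHost, _setTimeZoneOffset, _setCampaignParam, _addOrganicSource, _addReferralSource and _setUserIP.
Let's go through them in order.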
_setSessionLength
Set the duration of a user session in minutes.
Within this module, this value only affects whether a referral source is overwritten or not.
A few words about source overwriting. When a user comes to the site for the first time, we record the data about the source. The same user may later return to the site from another source, and we have to decide whether to overwrite the current source. The overwriting logic is exactly the same as in Google Analytics:
- Visits with utm markup overwrite anything and everything (even themselves).
- Visits from organic search results likewise overwrite everything, always.
- Direct visits never overwrite anything. They are recorded only on the very first visit to the site, provided that no other source has been recorded before.
- Referral visits within the current session do not overwrite anything; overwriting occurs only if the user has no open session. Here is why: during a visit, a user often comes back to the site from a third-party resource that is not the real source, for example from a mail service where they clicked a link to activate their registration.
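The rules above can be condensed into a small decision function. This is only a sketch of the logic as described here, not the module's actual code; the function name shouldOverwrite and its arguments are made up for illustration.

```javascript
// Hypothetical helper, not part of Sourcebuster's API.
// newType: 'utm' | 'organic' | 'referral' | 'typein'
// hasStoredSource: a source has already been recorded for this visitor
// hasSession: the visitor's session is still open
function shouldOverwrite(newType, hasStoredSource, hasSession) {
  if (!hasStoredSource) {
    return true; // very first visit: always record the source
  }
  switch (newType) {
    case 'utm':
    case 'organic':
      return true; // overwrite everything, always
    case 'referral':
      return !hasSession; // only when there is no open session
    default:
      return false; // direct visits never overwrite anything
  }
}
```

For instance, a referral visit from a mail service in the middle of an open session gives shouldOverwrite('referral', true, true) === false, so the real source stays in place.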
_setBaseHost
Set the base host within which all transitions will be considered internal (not referral) traffic. This setting is only relevant if you use subdomains on your site.
Suppose you have a website, site.com. Your site has a blog, blog.site.com, and you want transitions from the site to the blog and back to be counted as internal traffic, i.e. the source blog.site.com should not be captured as referral and should not overwrite other sources in a new session. To do this, set the base host to site.com with _setBaseHost.
With this setting, if the user moves from blog.site.com to site.com (or from alex.blog.site.com to site.com), the source will not be overwritten, and such a transition is equivalent to going from site.com/about to site.com/contacts.
Now consider the opposite scenario, where you want to separate subdomains and treat traffic between them as referral traffic. There is a main site (site.com) and a blog (blog.site.com), which has subdomains for users (alex.blog.site.com). You want transitions between blog.site.com and alex.blog.site.com to count as internal traffic, and transitions between these subdomains and the main site to count as referral traffic. To do this:
// on the pages of the main site
_sbjs.push(['_setBaseHost', 'site.com', false]);

// on the pages of the blog.site.com and alex.blog.site.com subdomains
_sbjs.push(['_setBaseHost', 'blog.site.com']);
Notice the third parameter, false, in the setting for the main site. Set it if and only if non-referral traffic should be limited to the specified domain itself. All other traffic, including transitions from subdomains of the same domain, will be considered referral traffic.
In our example, with this setting, all transitions between the main site and the blogs will be considered referral traffic. If a user reaches the main site for the first time by clicking a link on a user's blog, the source will be alex.blog.site.com (traffic type: referral).
Double-check two things when you use the parameter false:
- The domain of the page where _setBaseHost is set with the parameter false must match the host specified in this setting.

// Correct: on the pages of site.com
_sbjs.push(['_setBaseHost', 'site.com', false]);
// DOESN'T MAKE SENSE: on the pages of blog.site.com
_sbjs.push(['_setBaseHost', 'site.com', false]);

- The specified host does not have any subdomains whose traffic you want to be considered non-referral.

_sbjs.push(['_setBaseHost', 'site.com', false]);
//=> traffic from ALL subdomains to site.com will be treated as referral
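To make the two modes of _setBaseHost easier to compare, here is a rough sketch of how a referrer host could be classified as internal or not. It only illustrates the rule described above and is not the module's real implementation; the name isInternalHost and the includeSubdomains argument are hypothetical.

```javascript
// Hypothetical helper illustrating the rule; not the module's real code.
// includeSubdomains mirrors the third _setBaseHost argument (omitted means true).
function isInternalHost(referrerHost, baseHost, includeSubdomains) {
  if (includeSubdomains === undefined) {
    includeSubdomains = true;
  }
  if (referrerHost === baseHost) {
    return true; // same host is always internal
  }
  if (!includeSubdomains) {
    return false; // with false, only the exact host counts as internal
  }
  var suffix = '.' + baseHost;
  // internal when referrerHost ends with ".<baseHost>"
  return referrerHost.length > suffix.length &&
    referrerHost.slice(-suffix.length) === suffix;
}
```

With the base host site.com, a visit from blog.site.com is internal by default, while passing false as the third argument turns the same visit into referral traffic.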
_setTimeZoneOffset
Set the time zone.
By default, the date is saved in UTC. This setting lets you change the default time zone.
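As an illustration of what such an offset does to a stored date, here is a minimal sketch. It assumes the offset is given in whole hours relative to UTC (the installation example passes 4) and that dates are written as yyyy-mm-dd hh:mm:ss; the helper name formatVisitDate is invented for this example and is not part of the script.

```javascript
// Hypothetical helper; assumes the offset is given in hours relative to UTC.
function formatVisitDate(date, offsetHours) {
  // Shift the timestamp, then read the parts with the UTC getters
  var shifted = new Date(date.getTime() + (offsetHours || 0) * 60 * 60 * 1000);
  function pad(n) {
    return (n < 10 ? '0' : '') + n;
  }
  return shifted.getUTCFullYear() + '-' +
    pad(shifted.getUTCMonth() + 1) + '-' +
    pad(shifted.getUTCDate()) + ' ' +
    pad(shifted.getUTCHours()) + ':' +
    pad(shifted.getUTCMinutes()) + ':' +
    pad(shifted.getUTCSeconds());
}
```

With an offset of 4, a visit at 22:30:05 UTC on 2014-06-01 would be stored as 2014-06-02 02:30:05.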
_setCampaignParam
Set the GET parameter whose value will be saved as utm_campaign (if the request has no original utm_campaign parameter). This setting was added mainly because of the Google AdWords gclid tag.
Example of usage
If you have traffic from Google AdWords and you use the gclid tag, you can shorten the URLs by removing the utm markup. Sourcebuster will still determine that it is utm traffic from Google AdWords.
If the URL contains only the gclid tag, this will give the following result:
- Traffic type: utm
- utm_source: google
- utm_medium: cpc
- utm_campaign: google_cpc
- utm_content: (none)
- utm_term: (none)
You can change the value of utm_campaign by also passing the parameter set via _setCampaignParam in the URL.
Then the result will be:
- Traffic type: utm
- utm_source: google
- utm_medium: cpc
- utm_campaign: test_custom
- utm_content: (none)
- utm_term: (none)
- If the URL contains the original utm tags (utm_source, utm_medium, utm_campaign), the gclid tag and the parameter specified via _setCampaignParam will be ignored.
- If the URL contains only the parameter specified with _setCampaignParam, Sourcebuster will consider this transition as utm traffic.
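The precedence between the original utm tags, gclid and the custom campaign parameter can be sketched roughly as follows. This is a reading of the rules above, not the module's real code: the function name resolveUtm, the parameter name custom_campaign used in the usage note, and the "(none)" values returned when only the custom parameter is present are all assumptions.

```javascript
// Hypothetical sketch of the rules above; not the module's real code.
// campaignParam is the name passed to _setCampaignParam (may be undefined).
function resolveUtm(queryString, campaignParam) {
  var params = new URLSearchParams(queryString);
  var custom = campaignParam ? params.get(campaignParam) : null;

  // Original utm tags win: gclid and the custom parameter are ignored.
  if (params.has('utm_source') || params.has('utm_medium') || params.has('utm_campaign')) {
    return {
      source: params.get('utm_source') || '(none)',
      medium: params.get('utm_medium') || '(none)',
      campaign: params.get('utm_campaign') || '(none)'
    };
  }
  // gclid alone means Google AdWords traffic.
  if (params.has('gclid')) {
    return { source: 'google', medium: 'cpc', campaign: custom || 'google_cpc' };
  }
  // Only the custom parameter: still utm traffic (source/medium here are an assumption).
  if (custom) {
    return { source: '(none)', medium: '(none)', campaign: custom };
  }
  return null; // not utm traffic
}
```

Calling resolveUtm('?gclid=abc&custom_campaign=test_custom', 'custom_campaign') yields the campaign test_custom, while the same URL without the custom parameter falls back to google_cpc.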
_sbjs.push(['_addOrganicSource', 'yahoo.com', 'p']);
_sbjs.push(['_addOrganicSource', 'bing.com', 'q', 'bing']);
Add a source of organic traffic.
Suppose you want the system to count visits from bing.com search as organic traffic. To do this, you need to add the base host, 'bing.com', and the keyword parameter, 'q'. Both parameters are mandatory. You can also specify an alias for the source via an optional third parameter ('bing' in the example above).
To find the keyword parameter, go to bing.com and type a query into the search box (e.g., "apple"). This will take you to a page with an address like:
www.bing.com/search?q=apple&go=&qs=n&form=QBLH&pq=apple&sc=8-5&sp=-1&sk=&cvid=718ad07527244c319ecebf44aa261f64
The keyword parameter, 'q', is the text between "?" (or "&" if the parameter is not the first one after the question mark) and "=apple" in the URL of the search results page.
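Once you know the keyword parameter, reading the search query from a referrer URL is straightforward. The sketch below uses the standard URL API; the function name extractKeyword is hypothetical and is not something the script exposes.

```javascript
// Hypothetical helper: reads the search keyword from a results page URL.
function extractKeyword(referrerUrl, keywordParam) {
  var url = new URL(referrerUrl);
  // searchParams.get returns null when the parameter is absent
  return url.searchParams.get(keywordParam);
}
```

For the bing.com address shown above (with a protocol prepended), extractKeyword(url, 'q') would return "apple".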
_sbjs.push(['_addReferralSource', 'facebook.com', 'social']);
_sbjs.push(['_addReferralSource', 't.co', 'social', 'twitter.com']);
Add a source of referral traffic. In general, if you are fine with utm_medium having the value referral for visits from, e.g., facebook.com, then you don't need to set anything. But if you want to assign such visits a custom channel (e.g., utm_medium=social), you can add such a setting via _addReferralSource. The first parameter is the base host, the second is the desired value of utm_medium.
In addition, some resources send a referrer host that differs from their main domain (for example, Twitter's referrer host is t.co). In such cases, you can use the optional third parameter to assign an alias to the source. You can also use it to group multiple sites with different referrers into one source.
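The effect of these settings can be pictured as a lookup in a small rule table. The sketch below is only an illustration under assumptions: the table shape, the function name classifyReferral and the decision to also match subdomains of a listed host are not taken from the module.

```javascript
// Hypothetical rule table mirroring the _addReferralSource calls above.
var referralRules = [
  { host: 'facebook.com', medium: 'social' },
  { host: 't.co', medium: 'social', alias: 'twitter.com' }
];

function classifyReferral(referrerHost, rules) {
  for (var i = 0; i < rules.length; i++) {
    var rule = rules[i];
    var suffix = '.' + rule.host;
    // Match the host itself or any of its subdomains (an assumption)
    var matches = referrerHost === rule.host ||
      referrerHost.slice(-suffix.length) === suffix;
    if (matches) {
      return { src: rule.alias || rule.host, mdm: rule.medium };
    }
  }
  // Hosts without a rule keep the default channel
  return { src: referrerHost, mdm: 'referral' };
}
```

A visit from t.co is then reported with the source twitter.com and the channel social, while an unlisted host keeps the channel referral.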
_sbjs.push(['_setUserIP', '<%= request.remote_ip %>']);
By default, the script does not save the visitor's IP address. If you want to save it in a cookie, obtain it on the backend and pass it in via _setUserIP. The example shows how to do this in Ruby (an ERB template); note the quotes, since the address must reach JavaScript as a string.
So, the script is installed and configured. Now visitors get the following cookies when they come to the site:
Parameters of the most recent traffic source (the sbjs_current cookie).
If the user has changed the traffic source (1, 2, 3, many times), this cookie will hold the most recent one.
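- typ: traffic type. Possible values: utm, organic, referral, typein. There is no fifth.
- src: source. In fact, the value of utm_source.
- mdm: channel, the value of utm_medium. Can be configured through utm markup and _addReferralSource.
- cmp: advertising campaign. The value of utm_campaign.
- cnt: the version of the ad (e.g., a banner). The value of utm_content.
- trm: search keyword. The value of utm_term.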
Examples of content
# visit from a tagged advertisement
typ=utm|src=yandex|mdm=cpc|cmp=my_adv_campaign|cnt=banner_1|trm=buy_my_stuff
# visit from organic search
typ=organic|src=google|mdm=organic|cmp=(none)|cnt=(none)|trm=(none)
# referral from a third-party site
typ=referral|src=site.com|mdm=referral|cmp=(none)|cnt=(none)|trm=(none)
# visit from facebook with _addReferralSource
typ=referral|src=facebook.com|mdm=social|cmp=(none)|cnt=(none)|trm=(none)
# direct visit
typ=typein|src=typein|mdm=typein|cmp=(none)|cnt=(none)|trm=(none)
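A cookie value in this format is easy to take apart by hand. The following is a minimal sketch, assuming the value has already been URL-decoded; the helper name parseSbjsCookie is hypothetical.

```javascript
// Hypothetical helper: turns "typ=utm|src=yandex|..." into a plain object.
function parseSbjsCookie(value) {
  var result = {};
  var pairs = value.split('|');
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf('=');
    if (eq === -1) {
      continue; // skip malformed chunks without "="
    }
    // Split on the first "=" only, so values may contain "="
    result[pairs[i].slice(0, eq)] = pairs[i].slice(eq + 1);
  }
  return result;
}
```

Because "|" is the separator between parameters, a value that itself contains "|" cannot be split back correctly with this approach.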
The composition of this cookie is exactly the same as that of sbjs_current, but it stores the parameters of the visitor's very first source. This cookie is set once and is never overwritten.
Additional data about the user's first visit: date/time and entry point.
Date and time of the very first visit to the site by a particular user. Saved in the format yyyy-mm-dd hh:mm:ss. The default is UTC. You can change the time zone via _setTimeZoneOffset.
Site entry point.
Cookie flag indicating that the user has an open session. Lifetime: 30 minutes since the last activity, or your setting via _setSessionLength.
The referrer at the moment the source was written or overwritten.
The page of the "third-party" site from which the visitor came to the site.
Additional data about the user: IP address and user-agent.
uip=220.127.116.11|uag=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.114 Safari/537.36
The visitor's current IP address.
The visitor's current user-agent (browser).
Data is available via the get_sbjs variable:
get_sbjs.name_cookie_without_prefix_sbjs_.name_parameter_in_cookie

// for example, to get the current utm_source
get_sbjs.current.src
// the first utm_medium
get_sbjs.first.mdm
// entry point
get_sbjs.first_add.ep
// user-agent of the user
get_sbjs.udata.uag
// etc.
And of course you can parse the cookies yourself on the backend.
Let's display the visitor's current source on the page. To do this, the variable get_sbjs must already be defined. If get_sbjs has not yet loaded at the moment we call the function that outputs the current source, we should wait for the script (sourcebuster) to finish loading and only then run that function. For IE8 and below we also need a small fix to run the function at the right moment (after sourcebuster has loaded).
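One possible way to implement this waiting is sketched below. It is not the author's original snippet: the helper name runWhenLoaded is invented, and the element passed in is assumed to be the script tag with id="sbjs" from the installation example. The onreadystatechange branch is there for IE8 and below, which do not fire onload on script elements.

```javascript
// Hypothetical helper: runs callback once isLoaded() reports true.
// scriptEl is expected to be the <script id="sbjs"> element.
function runWhenLoaded(scriptEl, isLoaded, callback) {
  if (isLoaded()) {
    callback(); // the script is already there, run right away
    return;
  }
  var done = false;
  function fire() {
    if (done || !isLoaded()) {
      return; // run at most once, and only when the data is available
    }
    done = true;
    callback();
  }
  scriptEl.onload = fire; // modern browsers
  scriptEl.onreadystatechange = function () { // IE8 and below
    if (scriptEl.readyState === 'loaded' || scriptEl.readyState === 'complete') {
      fire();
    }
  };
}

// Browser usage (assumes an element with id "source" exists on the page):
// runWhenLoaded(
//   document.getElementById('sbjs'),
//   function () { return typeof get_sbjs !== 'undefined'; },
//   function () { document.getElementById('source').innerHTML = get_sbjs.current.src; }
// );
```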
Transitions from https to http
By the standard, when a visitor moves from an https page to an http page, there is no referrer in the request, and such visits will be identified by the module as typein (i.e. direct visits).
The "|" symbol in the utm markup
If you use it, get_sbjs will most likely not work correctly. Sorry.
I'm just starting to test the module on live projects. If anyone wants to join, you are welcome. It helps if you already have a working solution to compare the results against. If you catch a bug, it is best to report it as an issue on GitHub.
That's it from me. Thanks for your attention, and good luck.