I'm still very busy here, but what I can say for now is that there is some quite strange behavior on user profile pages, as you can see here: http://www.google.com/search?q=inurl:"community.invisionpower.com/user/94759-paulo+-freitas"&filter=0
I don't know where they're coming from, but the results with "page__f__" seem a bit buggy and probably dangerous for SEO. Can you take a look at it? :)
I'm racing against the clock to find more time to analyze these things; I think we can still optimize a lot more. :D
I don't know precisely, but I think that even when these pages return 403 errors, crawlers will still waste your traffic unnecessarily. Blocking URLs in robots.txt stops compliant crawlers from following these addresses. I can imagine how much this would cost boards with huge traffic. :ermm:
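To make the robots.txt point concrete, here is a minimal sketch of how a compliant crawler decides whether to request a URL at all. It uses Python's standard urllib.robotparser. The two Disallow paths, the example.com host and the ExampleBot name are made up for illustration; they are not the real IP.Board rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, for illustration only (not the real IP.Board defaults).
ROBOTS_TXT = """\
User-agent: *
Disallow: /hypothetical-error-page/
Disallow: /hypothetical-duplicate/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if a compliant crawler named `agent` may request `url`."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    for url in (
        "http://example.com/hypothetical-error-page/123",
        "http://example.com/topic/1-hello/",
    ):
        print(url, "allowed" if is_allowed(parser, url) else "blocked")
```

A blocked URL is never requested by a compliant crawler, so it costs no bandwidth, while a 403 page still costs one request per crawl attempt. Note that this standard-library parser only does plain prefix matching, so wildcard rules would need a different checker.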
I want to start a discussion here on how we can optimize our default robots.txt file so that it gets updated in future IP.Board versions. :)
It's simple: share your experiences. Use services like Google Webmaster Tools, Bing Webmaster Center and Yahoo! Site Explorer to identify which pages are getting duplicated or are throwing errors/problems for the crawlers.
From what I've already detected, (if I'm not wrong) we can block 5 more URLs:
I still haven't put these lines in my own robots.txt, but I tested them in Google Webmaster Tools (GWT) and I'm convinced that they will have a positive impact: 1) fewer useless indexed pages, 2) less duplicated content and 3) fewer HTML suggestions from GWT.
For the record, crawlers hate this: http://www.google.com/search?q=intitle%3A%22Board+Message%22+%22An+Error+Occurred%22+site%3Acommunity.invisionpower.com&filter=0
All these duplicated and useless pages have a negative impact on our rank with rigorous crawlers (like Google). We need to block them to reduce our penalty. :(
I'm still analysing my GWT reports and I'll post updates here with any new useless URLs I find. But I want more people involved, sharing their knowledge. :)
Sorry if my English is not perfect; I still need to dedicate more time to learning it. :huh:
OK, since we are talking about signatures: while I was editing some settings on my local board, I noticed that System Settings > Members > Members Profiles has another 3 signature settings (max_sig_length/sig_allow_html/sig_allow_ibc_yes). Shouldn't all signature settings be configured in the same place for consistency and (perhaps) usability?
I suggested this before, but in a big feature request topic that evolved very fast, so it was probably forgotten. :ermm:
For the record, we already have good settings for them:
But we lack a very important one: a file size limit for images.
You dev guys probably know that a 1x1px picture could be bigger than a 1920x1080px one in file size, right? We just need to exploit a certain header and... Well, I don't need to go into details, right? I can see problematic users doing this, and that's not what we want, right? :)
I remember the IP.Board 2.x days, when we used the Signatured Limitation mod, which had this setting among many others. Do you remember? :)
OK, you guys are talking about image dimensions. But how about a file size limit? You know that we could have, for example, a PNG with 1x1 px dimensions and many megabytes of file size, right? :whistle:
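For anyone who wants to see it, here is a small sketch of why a dimensions check alone is not enough. It builds a valid 1x1 PNG and pads it with a large text chunk, which is just one possible way to inflate a file and not necessarily the trick hinted at above. The function names and the limits (500x150 px, 100 KB) are made up for the example and are not IP.Board settings.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def tiny_png_with_padding(padding_bytes: int) -> bytes:
    """Return a 1x1 grayscale PNG carrying `padding_bytes` of comment text."""
    # IHDR: width, height, bit depth 8, color type 0 (grayscale), then
    # compression, filter and interlace methods all set to 0.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    pixel = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    text = b"Comment\x00" + b"A" * padding_bytes  # ancillary tEXt payload
    return (
        PNG_SIGNATURE
        + chunk(b"IHDR", ihdr)
        + chunk(b"tEXt", text)
        + chunk(b"IDAT", pixel)
        + chunk(b"IEND", b"")
    )


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of the file."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def signature_image_ok(data: bytes, max_w: int, max_h: int, max_bytes: int) -> bool:
    """Hypothetical check: enforce the dimensions AND the file size."""
    width, height = png_dimensions(data)
    return width <= max_w and height <= max_h and len(data) <= max_bytes


if __name__ == "__main__":
    big = tiny_png_with_padding(5 * 1024 * 1024)
    print(png_dimensions(big), len(big), "bytes")
    print("accepted:", signature_image_ok(big, 500, 150, 100 * 1024))
```

The padded file passes any width/height limit because its header still says 1x1, so only the byte-size comparison in signature_image_ok rejects it.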