PeterUK

Members
  • Posts

    381

4 Followers



About PeterUK

  • Rank
    Advanced Member

IPS Marketplace

  • Resources Contributor Total file submissions: 1

Profile Information

  • Gender Male

PeterUK's Activity

  1. PeterUK added a post in a topic: JavaScript Password Hashing   


    It's comparatively faster than it used to be. It's not faster than raw HTTP, but it isn't far off (excluding negotiating the initial connection). If you're talking about people using SPDY rather than just plain HTTPS, then that's a different matter as well.

    That being said, I am not advocating against using HTTPS all the time, I think that would be fine.
  2. PeterUK added a comment on a blog entry: Get Ready For IPS 4.0!   

    It heavily depends on the size of your actual posts as well.  Larger ones obviously take longer to convert.  We have around 5 million and a beefy server and the normal script for us would have been more than 48 hours of solid run time (I stopped running it after about 24 hours straight).
  3. PeterUK added a comment on a blog entry: Get Ready For IPS 4.0!   

     
    If you have a large board, it may be worth checking out my converter for doing this. I should note that it's completely unofficial, but it has positive reviews from those who have tried it and can leverage parallel processing on your Linux server to speed things up.
     
    http://community.invisionpower.com/files/file/6758-unofficial-multi-threaded-utf-8-converter/
  4. PeterUK added a post in a topic: Unofficial Multi-Threaded UTF-8 Converter   

    Yes, in your config you should set that to "utf8", although I believe it's supposed to default to that anyway. Since IPB is reporting an error, there should definitely be an entry in your log. Try loading the website, then check cache/sql_error_latest.cgi or whatever it's called.
  5. PeterUK added a post in a topic: Unofficial Multi-Threaded UTF-8 Converter   


    As xtech says, please check your SQL errors and let us know what the problem query is. We may be able to work out what happened from that.
  6. PeterUK added a post in a topic: Unofficial Multi-Threaded UTF-8 Converter   

    Updated the script to use a superior library amongst other things.
  7. PeterUK added a post in a topic: Unofficial Multi-Threaded UTF-8 Converter   

    Ah yes, I didn't realise this had been released publicly. I tested it some time ago, before it was released. You should indeed use that one if you don't want multi-threading.
  8. PeterUK added a post in a topic: Unofficial Multi-Threaded UTF-8 Converter   

    Sorry for the slow response. You are correct, Linux is a requirement for this script. It would be possible to make it run single-threaded, but ultimately you may as well just run bfarber's script. I am about to issue a patched version of this, though, which uses a superior third-party library for UTF-8 encoding.
  9. PeterUK added a post in a topic: Unofficial Multi-Threaded UTF-8 Converter   


    Well, remember that the collation is independent of the character set. It's no problem having a mix of utf8_general_ci and utf8_unicode_ci; it's all still utf8, and the collation only affects how MySQL sorts and compares the data. The problem comes when you have what you describe: data stored in one character set while the character set declared in the DB says something else. It sounds like, if all your data is really latin1, you need to set latin1 as your "from" character set and utf8 as your "to" character set, and see if it converts correctly.
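To illustrate the point about "from" and "to" character sets, here is a minimal Python sketch (not the converter's own PHP code). The sample string and the `convert` helper are hypothetical. Python's `latin-1` codec is used as a stand-in for MySQL's latin1, which is closer to Windows-1252, so `cp1252` may be the better codec for real data.

```python
# Hypothetical sample: the text "café" as it would sit in a latin1 column.
stored = "café".encode("latin-1")  # b'caf\xe9'


def convert(raw: bytes, from_charset: str, to_charset: str) -> bytes:
    """Decode with the charset the bytes are really in, then re-encode."""
    return raw.decode(from_charset).encode(to_charset)


# Correct "from" charset: the single byte 0xE9 becomes the two-byte UTF-8 form.
converted = convert(stored, "latin-1", "utf-8")

# Wrong "from" charset: latin1 bytes are not valid UTF-8, so decoding fails.
try:
    convert(stored, "utf-8", "utf-8")
    wrong_from_failed = False
except UnicodeDecodeError:
    wrong_from_failed = True

# Collation plays no part in any of this; it only changes sorting and comparison.
```

Picking the wrong "from" character set is what produces either errors like the one above or silently garbled text, which is why the declared character set has to match what is really stored.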
  10. PeterUK added a post in a topic: Unofficial Multi-Threaded UTF-8 Converter   


    It sounds like it, because utf8_general_ci isn't mentioned anywhere in the file; by default it creates tables with utf8_unicode_ci collation.

    Can you issue the following for one of the tables in question and paste the output?
    SHOW CREATE TABLE `table_name`;
    Perhaps your server is putting the collation within the table definition, which the script does not override.

    You can also run
    SHOW COLLATION LIKE 'utf8%';
    and if utf8_general_ci is listed as the default for you, then that's most likely why.
  11. PeterUK added a post in a topic: Unofficial Multi-Threaded UTF-8 Converter   

    That looks like a conversion problem somewhere before the query hits the DB. Is this a problem *after* the conversion? As for the creation of skipped tables, by default it should skip over the session data but create the table itself.
  12. PeterUK added a post in a topic: Unofficial Multi-Threaded UTF-8 Converter   

    Your question is a bit confusing. You say that you changed to utf8, which broke some of your older characters, but then later on you say your database is in latin1. Are you saying you just changed the character set of your tables from latin1 to utf8 (and didn't convert any data), so new posts were inserted in utf8 but older posts are still in latin1? If so, you could give this a try (using latin1 as the input character set) and see what it comes out with. I honestly don't know how it would handle that. Another thing you could do is specify utf8 as both the input and output character set, and use the mappings to convert characters which you know are problematic. I could patch the script so that if the input charset equals the output charset, it only performs replacements and doesn't attempt the (slower) character conversion.
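The proposed patch (skip the conversion when input and output charsets match, and only apply the mappings) could look roughly like the Python sketch below. It is an illustration of the idea rather than the script's actual PHP code; the `convert_row` function and the sample mapping are hypothetical.

```python
def convert_row(raw: bytes, from_charset: str, to_charset: str,
                replacements: dict) -> bytes:
    """Convert one value, then apply byte-level replacements.

    When both charsets are the same, the (slower) decode/encode step is
    skipped and only the replacements run.
    """
    if from_charset.lower() != to_charset.lower():
        raw = raw.decode(from_charset).encode(to_charset)
    for bad, good in replacements.items():
        raw = raw.replace(bad, good)
    return raw


# Hypothetical mapping: "Ã©" is what a UTF-8 "é" looks like after being
# read as latin1 and saved as UTF-8 again (double encoding).
mapping = {"Ã©".encode("utf-8"): "é".encode("utf-8")}

# Same charset in and out: replacements only.
fixed = convert_row("cafÃ©".encode("utf-8"), "utf8", "utf8", mapping)

# Different charsets: normal conversion, then the (empty) mapping.
plain = convert_row(b"caf\xe9", "latin-1", "utf-8", {})
```

The mapping has to list every problematic sequence by hand, so this approach only suits cases where the set of broken characters is small and known.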
  13. PeterUK added a post in a topic: Unofficial Multi-Threaded UTF-8 Converter   

    It's understood that IPB4 will require users to convert to UTF-8 first, yes. However, this encoder is aimed at people for whom the time taken by that process is unacceptable and who need something faster.

    You can use the script if you want. However, it will take what you give as your current character set and convert it to the new one. If you tell it your current characters are UTF-8 and ask it to convert to UTF-8, I doubt anything will happen other than your time being wasted. :P
  14. PeterUK added a post in a topic: Unofficial Multi-Threaded UTF-8 Converter   

    Fixed a small typo in the post-conversion replacement section of the code and backdated the upload since it didn't really warrant a new version number.
  15. PeterUK added a post in a topic: Unofficial Multi-Threaded UTF-8 Converter   

    File Name : Unofficial Multi-Threaded UTF-8 Converter
    File Submitter : PeterUK
    File Submitted : 20 Oct 2013
    File Category : Maintenance
    Supported Versions : IP.Board 3.1.x, IP.Board 3.2.x, IP.Board 3.3.x, IP.Board 3.4.x



    With the release of IPB 4 coming in the future, one of the things that is going to hit people with large communities hard is the conversion to UTF-8, if you are not already there. This can be a huge task. bfarber's converter works fine for this, but it can be very slow, as PHP can only run so quickly when it's bound to a single CPU thread. On my forum, which has a total DB size of ~10GB, bfarber's single-threaded script takes over 48 hours straight to run. I found this amount of downtime for my live forum to be unacceptable, so I created a fork of his script and used PHP's fork functions to multi-thread it. With 8 cores assigned to the script, I was able to convert this database in just 5 hours under testing. When I did my live conversion, I assigned 12 cores and was able to do it in 3 hours.
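As a rough illustration of the fork-based approach described above, here is a minimal Python sketch using `os.fork` (Linux/Unix only). It is not the script's PHP code: `split_ranges`, `convert_chunk` and `run_forked` are hypothetical names, the rows are an in-memory list rather than a database table, and the latin1 to UTF-8 direction is assumed for the example.

```python
import os


def split_ranges(total_rows: int, workers: int) -> list:
    """Split row offsets [0, total_rows) into contiguous, near-equal chunks."""
    base, extra = divmod(total_rows, workers)
    ranges, start = [], 0
    for i in range(workers):
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def convert_chunk(rows: list) -> int:
    """Stand-in for the real work: convert each row, return how many were done."""
    done = 0
    for row in rows:
        row.decode("latin-1").encode("utf-8")
        done += 1
    return done


def run_forked(rows: list, workers: int) -> int:
    """Fork one child per chunk and sum the per-child counts in the parent."""
    children = []
    for start, end in split_ranges(len(rows), workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:  # child process
            try:
                os.close(read_fd)
                # A real converter would open its own DB connection here.
                done = convert_chunk(rows[start:end])
                os.write(write_fd, str(done).encode())
                os.close(write_fd)
            finally:
                os._exit(0)  # never fall through into the parent's code
        os.close(write_fd)
        children.append((pid, read_fd))

    total = 0
    for pid, read_fd in children:
        with os.fdopen(read_fd) as pipe:
            total += int(pipe.read())
        os.waitpid(pid, 0)
    return total
```

For example, `run_forked([b"caf\xe9"] * 10, 3)` splits ten rows into chunks of 4, 3 and 3 and reports 10 rows processed. Forking gives each worker its own process, which is why the speed-up scales with the number of cores assigned.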

    This script is aimed at people with large databases, where a conversion would normally take a significant amount of time, and people who are power users and understand server maintenance and configuration.

    Please ensure you carefully read the included readme.txt file, and the documentation in the PHP file for the settings.

    The database I converted was an IPB 3.4.5 database. Since this is based on bfarber's script from 2010, there's no reason it shouldn't work on pretty much any IPB version in the 3.x.x series, but I will only be officially supporting it on 3.4.5.

    This converter has the following enhancements over the original:
      • Multi-threading
      • Use of PDO directly for DB connections, which gives a small improvement over using IPB's library
      • Use of PDO prepared statements for multiple inserts, meaning less data and less CPU time are used by MySQL
      • Ability to remap characters post-conversion
      • Easy-to-read exceptions thrown to STDERR on query failure
      • Useful ETA information based on the records already processed
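An ETA "based on the records already processed" can be estimated by assuming the remaining rows convert at the average rate seen so far. The Python sketch below shows that arithmetic only; the function names and the sample numbers are hypothetical, and the script itself may calculate its ETA differently.

```python
from datetime import timedelta
from typing import Optional


def eta_seconds(processed: int, total: int, elapsed: float) -> Optional[float]:
    """Estimate seconds remaining from the average rate so far."""
    if processed <= 0 or elapsed <= 0:
        return None  # nothing measured yet, so no estimate
    remaining = max(total - processed, 0)
    return remaining * elapsed / processed


def format_eta(seconds: Optional[float]) -> str:
    """Render an estimate as H:MM:SS, or 'unknown' before any progress."""
    if seconds is None:
        return "unknown"
    return str(timedelta(seconds=round(seconds)))


# Hypothetical progress: 250,000 of 1,000,000 rows done after 60 seconds.
estimate = eta_seconds(250_000, 1_000_000, 60.0)
```

With those sample numbers, 750,000 rows remain at the observed rate, so `format_eta(estimate)` gives `0:03:00`.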



