Python3 port: why and how far are we?

Why

  • python2 is frozen: there will never be a python2.8
  • python2 will see no improvements
  • keeping up with python is good (with, iterators, metaclasses, …)
  • tryton is a good python citizen (pypi, tries to minimize NIH, …)

What changes

  • print is a function
  • iterators are used instead of lists (range, map, zip, dict views, …)
  • / is float (true) division
  • unicode changes
  • metaclass syntax
  • string formatting
  • inheriting from BaseException is mandatory for exceptions
  • super()
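
A minimal sketch of how the items above behave under python3. The class and variable names (Meta, Model, Party, …) are made up for the illustration and are not taken from trytond.

```python
# Python 3 behaviour for the items listed above (illustrative names only).

# print is a function
print('ported', 'to', 'python3', sep=' ')

# iterators: range/map/zip and dict views are lazy, not lists
doubled = map(lambda x: x * 2, range(3))
doubled_is_list = isinstance(doubled, list)
doubled_values = list(doubled)

# division: / is true division, // is floor division
true_div = 1 / 2
floor_div = 1 // 2

# unicode: str is text, bytes is binary data; conversion is explicit
text = 'é'
raw = text.encode('utf-8')

# string formatting with str.format
label = '{0} has {1} modules'.format('trytond', 3)


# metaclass syntax: keyword argument instead of __metaclass__
class Meta(type):
    pass


class Model(metaclass=Meta):
    def name(self):
        return 'model'


# super() without arguments
class Party(Model):
    def name(self):
        return 'party.' + super().name()


# exceptions must inherit from BaseException
try:
    raise 'a plain string'
except TypeError:
    string_raise_rejected = True
```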

What is the scope of the migration

  • trytond
  • trytond modules
  • proteus
  • tryton is not included (but maybe it should be)

Migration strategy

  1. wait for the libraries to migrate and if necessary fix them
  2. use 2to3 to convert the old syntax (166K patch!)
  3. run the server, launch the tests
  4. fix the errors
  5. repeat until everything is OK

What is the status of the libraries

Package                     Ready  Patches  Unavailable
lxml                        X
python-dateutil             X
polib                       X
(real) database connectors  X
Genshi                             X(*)
relatorio                          X
pywebdav                           X
LDAP                                        X

Most libraries are ported and those that are not have patches. One notable exception: python-ldap.

Code issue (I)

sql_format of fields.Char

     def sql_format(value):
         if value is None:
             return None
-        elif isinstance(value, str):
-            return str(value, 'utf-8')
-        assert isinstance(value, str)
         return value

  • this code ensured that only unicode is sent to the database
  • it could be removed because
    • the client sends unicode
    • strings in the code are ASCII and thus correctly handled

Code issue (II)

a few bytes vs buffer issues

             'minute': obj.minute,
             'second': obj.second,
             }
-    elif isinstance(obj, buffer):
+    elif isinstance(obj, bytes):
         return {'__class__': 'buffer',
-            'base64': base64.encodestring(obj),
+            'base64': base64.encodestring(obj).decode('utf-8'),
             }
     elif isinstance(obj, Decimal):
         return {'__class__': 'Decimal',
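
A self-contained sketch of the same idea, assuming a JSON `default` hook and a matching `object_hook`. The helper names encode_bytes and decode_bytes are hypothetical, not trytond's. It uses base64.encodebytes because base64.encodestring was removed in python3.9.

```python
import base64
import json


def encode_bytes(obj):
    """json.dumps 'default' hook: wrap bytes in a base64 text payload."""
    if isinstance(obj, bytes):
        return {'__class__': 'buffer',
                # encodebytes returns bytes; JSON needs text, hence decode
                'base64': base64.encodebytes(obj).decode('utf-8'),
                }
    raise TypeError('{!r} is not JSON serializable'.format(obj))


def decode_bytes(dct):
    """json.loads object_hook: turn the wrapped payload back into bytes."""
    if dct.get('__class__') == 'buffer':
        return base64.decodebytes(dct['base64'].encode('utf-8'))
    return dct


payload = b'\x00\xffbinary'
wire = json.dumps({'data': payload}, default=encode_bytes)
restored = json.loads(wire, object_hook=decode_bytes)
```

The text on the wire is plain ASCII, and restored['data'] is the original bytes object.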

Code issue (III)

how sqlite stores NUMERIC

             select += ' OFFSET %d' % offset
         return select

-sqlite.register_converter('NUMERIC', lambda val: Decimal(val))
+sqlite.register_converter('NUMERIC', lambda val: Decimal(val.decode('utf-8')))
 if sys.version_info[0] == 2:
     sqlite.register_adapter(Decimal, lambda val: buffer(str(val)))
 else:
-    sqlite.register_adapter(Decimal, lambda val: bytes(str(val)))
+    sqlite.register_adapter(Decimal, lambda val: str(val).encode('utf-8'))

This should be OK since:

>>> from decimal import Decimal
>>> x = Decimal()
>>> Decimal(str(x).encode('utf-8').decode('utf-8')) == x
True
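
The round trip can also be checked against a real database. This is a sketch for python3 only, using an in-memory sqlite database; the table and column names are invented for the example.

```python
import sqlite3
from decimal import Decimal

# Decimal -> bytes when writing, bytes -> Decimal when reading a NUMERIC column
sqlite3.register_adapter(Decimal, lambda val: str(val).encode('utf-8'))
sqlite3.register_converter('NUMERIC',
    lambda val: Decimal(val.decode('utf-8')))

# PARSE_DECLTYPES makes sqlite3 pick the converter from the declared type
conn = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute('CREATE TABLE move (unit_price NUMERIC)')
conn.execute('INSERT INTO move (unit_price) VALUES (?)', (Decimal('4.20'),))
(stored,) = conn.execute('SELECT unit_price FROM move').fetchone()
conn.close()
```

The value comes back as a Decimal with its trailing zero intact, which is the point of storing the textual form instead of a float.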

Code examples (I)

  • calls to decode

e.g. trytond/protocols/jsonrpc.py

         existing method through subclassing is the prefered means
         of changing method dispatch behavior.
         """
-        rawreq = json.loads(data, object_hook=object_hook)
+        rawreq = json.loads(data.decode('utf-8'), object_hook=object_hook)

         req_id = rawreq.get('id', 0)
         method = rawreq['method']

Code examples (II)

  • calls to encode

e.g. trytond/protocols/jsonrpc.py

             # report exception back to server
             response['error'] = (str(sys.exc_info()[1]), tb_s)

-        return json.dumps(response, cls=JSONEncoder)
+        return json.dumps(response, cls=JSONEncoder).encode('utf-8')
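
Both changes follow the same rule: bytes at the network boundary, str inside. A minimal sketch, assuming a hypothetical handle_request function that is not the actual trytond dispatcher:

```python
import json


def handle_request(data):
    """Take the raw request bytes, return the raw response bytes."""
    # decode on the way in: the socket delivers bytes, json works on text
    rawreq = json.loads(data.decode('utf-8'))
    response = {
        'id': rawreq.get('id', 0),
        'result': 'called ' + rawreq['method'],
        'error': None,
        }
    # encode on the way out: the socket only accepts bytes
    return json.dumps(response).encode('utf-8')


reply = handle_request(b'{"id": 7, "method": "common.server.version"}')
```

Since python3.6 json.loads also accepts bytes directly, but the explicit decode was needed on python3.2.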

How far are we?

  • not so many issues
  • patch for PY3 compatibility is ready for trytond
  • module migration: all modules up to stock are currently migrated
  • the python-ldap problem remains

When can we expect python3?

  • the main issue is dropping support for python2.6
  • python2.6 is 4 years old (first released on 2008-10-01)
  • python2.6 is in security-fix-only mode (which will end next year)
  • it is still the default python on Debian (this will change with the upcoming release)
  • once we can switch to python2.7 we can move on
  • only big issue: python-ldap

Benchmarks

  • run against the version of September 21st
  • it does not mean much but gives an indication
  • run with python3.2; python3.3 is faster and uses less memory
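
import math
import sys
import timeit

def avg(lst):
    return sum(lst) / len(lst)

def std_dev(lst):
    # population standard deviation of the timings
    mean = avg(lst)
    return math.sqrt(sum((x - mean) ** 2 for x in lst) / len(lst))

# the function to benchmark is named on the command line and must be
# defined in this script so that the timeit setup can import it
func_name = sys.argv[1]
results = timeit.repeat('{}()'.format(func_name), repeat=5, number=100,
    setup='from __main__ import {}'.format(func_name))
print('{0:.6}s ± {1:.5}'.format(avg(results), std_dev(results)))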

Running the tests

Test        python2.7          python3.2
test_stock  28.5806s ± 0.3398  30.2421s ± 0.4267

This benchmark mainly tests the initialization of the database.

≃ 6% slower

Initializing the pool

from trytond.pool import Pool
from trytond.transaction import Transaction

def pool_initialization():
    dbname = 'db_py3'
    Pool.start()
    pool = Pool(dbname)

    with Transaction().start(dbname, 0):
        pool.init()

Test       python2.7          python3.2
pool init  9.11617s ± 0.0768  9.35978s ± 0.0950

≃ 3% slower

A simple search

def test_module_list():
    dbname = 'db_py3'
    Pool.start()
    pool = Pool(dbname)

    with Transaction().start(dbname, 0, context={}):
        pool.init()
        module_obj = pool.get('ir.module.module')
        module_obj.search([('state', '=', 'installed')])

Test          python2.7           python3.2
list modules  9.2981s ± 0.064182  9.75338s ± 0.06838

≃ 5% slower

Creating and validating shipments

shipments = []
for _ in range(30):
    ship_id = shipment_in_obj.create({
            'supplier': supplier,
            'company': company,
            'incoming_moves': [('create', {
                        'product': product,
                        'quantity': 50, 'uom': unit,
                        'from_location': supplier_location,
                        'to_location': input_location,
                        'company': company,
                        'unit_price': Decimal(4),
                        })]})
    shipments.append(ship_id)
received_ids = [x for x in shipments if x % 2]
done_ids = [x for x in received_ids if x % 3]
cancel_received_ids = [x for x in received_ids if not x % 3]
cancel_other_ids = [x for x in shipments if not x % 2]
shipment_in_obj.receive(received_ids)
shipment_in_obj.done(done_ids)
shipment_in_obj.cancel(cancel_received_ids)
shipment_in_obj.cancel(cancel_other_ids)

Test            python2.7          python3.2
use stock move  2200.21s ± 15.273  2195.2s ± 70.102

≃ no difference
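
The percentages quoted on the benchmark slides can be recomputed from the mean timings. A small sketch; the slowdown helper and the dictionary layout are only for illustration.

```python
# mean timings in seconds, (python2.7, python3.2), copied from the tables
timings = {
    'test_stock': (28.5806, 30.2421),
    'pool init': (9.11617, 9.35978),
    'list modules': (9.2981, 9.75338),
    'use stock move': (2200.21, 2195.2),
    }


def slowdown(py2, py3):
    """Relative slowdown of python3 over python2, in percent."""
    return (py3 - py2) / py2 * 100


slowdowns = {name: slowdown(*pair) for name, pair in timings.items()}
for name, pct in slowdowns.items():
    print('{0}: {1:+.1f}%'.format(name, pct))
```

For the last benchmark the difference is well inside the reported standard deviations, hence "no difference".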
