Re: Are you all aware that there was a slight drop in performancein going from pre3 to pre5?

From: Jeff V. Merkey (jmerkey@timpanogas.com)
Date: Tue Apr 18 2000 - 13:32:40 EST


Also, minor correction. It was dput() that was oinking in 2.4 and not
d_lookup(). This is probably worse since dput() may trigger inode
updates in underlying file systems. It took a lot longer to copy the
files under 2.3.99 than 2.2.15.

:-)

Jeff

"Jeff V. Merkey" wrote:
>
> Alexander Viro wrote:
> >
> >
> > It's not necessarily bad. I'ld love to look at the whole picture - right
> > now the whole lookup mechanism is in the middle of a sequence of changes
> > and timings definitely _will_ change several times before it will settle
> > down. Jeff, care to email the results of profiling you've got?
> >
>
> Al,
>
> Here's the results for two runs of 165 MB of files copied from a NetWare
> Partition to Another NetWare Partition on 2.2.15 and 2.3.99 and another
> copy from an EXT2 partition to a NetWare partition (both copies were run
> concurrently for each Linux version). Both perform very well, but
> 2.2.15 is slightly faster on raw I/O. If you are doing stuff with
> cached files, 2.3.99 is faster (like just reading from page cache).
>
> You may see something that will indicate where the problem lies. I
> noticed a lot more paging activity on 2.4, so perhaps the extra page
> faults are causing some performance issues. Task Gates are very heavy
> on Intel systems, as are interrupts and exceptions.
>
> Jeff
>
> ------------------------------------------------------------------------
> 80100000 startup_32 4 0.0244
> 80108104 __switch_to 9 0.0479
> 80108214 sys_execve 1 0.0081
> 801093f4 system_call 50 0.8929
> 8010942c ret_from_sys_call 3 0.1765
> 8010943d ret_with_reschedule 3 0.1875
> 8010944d restore_all 22 1.4667
> 8010945c signal_return 1 0.0417
> 801094c4 ret_from_intr 2 0.0625
> 80109500 error_code 18 0.3462
> 80109624 page_fault 39 3.2500
> 8010aed0 synchronize_bh 6 0.0750
> 8010af20 synchronize_irq 32 1.6000
> 8010af34 __global_cli 6 0.0181
> 8010b558 probe_irq_on 38 0.1979
> 8010b618 probe_irq_off 7 0.0486
> 8010f788 smp_flush_tlb 6 0.0484
> 8010f834 smp_send_stop 2 0.0714
> 80110378 do_check_pgt_cache 1 0.0052
> 80110928 do_page_fault 26 0.0283
> 801110d0 flush_signal_handlers 1 0.0167
> 801112ec send_sig_info 1 0.0013
> 80111a40 sys_rt_sigprocmask 1 0.0016
> 801122b8 do_sigaction 3 0.0081
> 801128bc sys_rt_sigaction 2 0.0059
> 80112b58 reschedule_idle 4 0.0086
> 80112d28 wake_up_process 10 0.0926
> 80112da4 add_timer 1 0.0025
> 801130c0 del_timer 1 0.0114
> 801131e0 schedule 24 0.0251
> 8011359c __wake_up 10 0.1000
> 80113600 __up 3 0.0536
> 80113638 __down 15 0.0568
> 801138b8 interruptible_sleep_on 4 0.0238
> 80115438 mmput 1 0.0192
> 8011546c copy_files 4 0.0061
> 80115700 do_fork 6 0.0026
> 801167b8 printk 1 0.0025
> 80117e6c in_group_p 1 0.0167
> 80117ea8 sys_newuname 1 0.0049
> 80119234 qm_symbols 3 0.0070
> 801193e0 qm_info 1 0.0035
> 80119500 sys_query_module 1 0.0024
> 80119f50 release 2 0.0057
> 8011a0b0 session_of_pgrp 1 0.0119
> 8011a364 exit_notify 3 0.0046
> 8011a5ec do_exit 2 0.0028
> 8011a8cc sys_wait4 2 0.0021
> 8011bb50 do_bottom_half 2 0.0125
> 8011d968 clear_page_tables 5 0.0284
> 8011da70 new_page_tables 2 0.0111
> 8011db24 copy_page_range 15 0.0302
> 8011dd14 zap_page_range 18 0.0398
> 8011e3e4 put_dirty_page 2 0.0052
> 8011e568 do_wp_page 58 0.1133
> 8011e9d8 do_anonymous_page 299 2.3359
> 8011ea58 do_no_page 13 0.0533
> 8011eb4c handle_mm_fault 19 0.0420
> 8011ed80 vm_enough_memory 10 0.0806
> 8011edfc sys_brk 17 0.0590
> 8011ef1c do_mmap 16 0.0163
> 8011f2f4 get_unmapped_area 1 0.0089
> 8011f590 find_vma 12 0.1250
> 8011f5f0 find_vma_prev 10 0.0568
> 8011f6a0 unmap_fixup 2 0.0070
> 8011f7bc free_pgtables 1 0.0074
> 8011f844 do_munmap 6 0.0113
> 8011fb3c exit_mmap 2 0.0075
> 8011fc48 insert_vm_struct 8 0.0233
> 8011fda0 merge_segments 9 0.0245
> 8011ff8c truncate_inode_pages 1 0.0032
> 80120264 update_vm_cache_conditional 1 0.0031
> 801204d4 __wait_on_page 3 0.0099
> 80120604 do_generic_file_read 20 0.0132
> 80120bf0 file_read_actor 101 1.2024
> 80120c44 generic_file_read 1 0.0068
> 80120fe0 filemap_nopage 100 0.1220
> 80121314 filemap_write_page 1 0.0041
> 8012176c generic_file_mmap 1 0.0064
> 80122190 change_protection 2 0.0062
> 801222d4 mprotect_fixup 2 0.0032
> 8012400c kmem_cache_grow 2 0.0021
> 80124408 kmem_cache_alloc 24 0.0682
> 80124568 kmem_cache_free 19 0.0470
> 801246fc kmalloc 23 0.0564
> 80124894 kfree 18 0.0388
> 80125154 swap_out_vma 1 0.0026
> 80125b20 __free_page 36 0.1343
> 80125c2c free_pages 21 0.0603
> 80125d88 __get_free_pages 56 0.0824
> 80126670 free_page_and_swap_cache 1 0.0104
> 801266d0 lookup_swap_cache 5 0.0250
> 801280ec sys_access 2 0.0070
> 801284d4 sys_chmod 1 0.0052
> 8012880c filp_open 5 0.0205
> 80128900 get_unused_fd 6 0.0169
> 80128a64 sys_open 4 0.0238
> 80128b24 __fput 2 0.0278
> 80128b6c filp_close 3 0.0312
> 80128bcc sys_close 12 0.0882
> 80128d58 sys_lseek 1 0.0046
> 80128fe8 sys_read 16 0.0702
> 801290cc sys_write 14 0.0479
> 80129c50 insert_file_free 2 0.0500
> 80129c78 get_empty_filp 4 0.0132
> 80129e18 fput 1 0.0147
> 80129e90 __wait_on_buffer 10 0.0325
> 80129fc4 sync_buffers 1 0.0021
> 8012a3d4 remove_from_queues 2 0.0062
> 8012a594 insert_into_queues 3 0.0112
> 8012a6a0 find_buffer 7 0.0500
> 8012a74c get_hardblocksize 1 0.0278
> 8012a9e0 init_buffer 1 0.0167
> 8012aa48 getblk 3 0.0094
> 8012abc8 refile_buffer 1 0.0057
> 8012ac78 __brelse 5 0.0568
> 8012acfc bread 1 0.0096
> 8012aed8 put_unused_buffer_head 1 0.0132
> 8012af24 get_unused_buffer_head 4 0.0263
> 8012b308 brw_page 18 0.0205
> 8012b674 mark_buffer_uptodate 2 0.0227
> 8012b6cc generic_readpage 2 0.0147
> 8012ba0c sync_old_buffers 1 0.0022
> 8012bca4 bdflush 1 0.0014
> 8012e0e0 cp_new_stat 12 0.0306
> 8012e2f4 sys_newstat 1 0.0074
> 8012e528 sys_newfstat 8 0.0500
> 8012e7e4 sys_uselib 1 0.0050
> 8012e8e4 copy_strings 2 0.0045
> 8012ebac read_exec 1 0.0028
> 8012f844 pipe_write 4 0.0076
> 8012fe90 getname 4 0.0270
> 8012ff24 permission 8 0.0303
> 8013002c get_write_access 1 0.0312
> 80130058 reserved_lookup 5 0.0694
> 801300a0 cached_lookup 2 0.0250
> 801300f0 real_lookup 2 0.0128
> 8013018c do_follow_link 5 0.0379
> 80130210 lookup_dentry 34 0.0702
> 8013044c open_namei 11 0.0150
> 80131d24 sys_fcntl 1 0.0011
> 80132620 filldir 1 0.0081
> 8013269c sys_getdents 1 0.0036
> 8013290c max_select_fd 2 0.0104
> 801329cc do_select 3 0.0052
> 80132c10 sys_select 8 0.0056
> 80133cc8 locks_remove_posix 3 0.0214
> 80134c00 dput 16 0.0494
> 80134d44 d_invalidate 1 0.0114
> 80134d9c prune_dcache 1 0.0036
> 80135178 d_alloc 8 0.0238
> 80135330 d_lookup 44 0.2037
> 801354f8 d_rehash 2 0.0294
> 80135850 __mark_inode_dirty 3 0.0312
> 801358b0 __wait_on_inode 4 0.0196
> 8013597c sync_inodes 9 0.0321
> 80135c78 clear_inode 2 0.0238
> 80135e14 __free_inodes 1 0.0100
> 80136050 grow_inodes 1 0.0022
> 80136210 find_inode 27 0.5625
> 80136240 clean_inode 6 0.0577
> 801362a8 get_empty_inode 1 0.0057
> 80136358 get_new_inode 4 0.0135
> 8013653c iget 1 0.0083
> 80136634 iput 16 0.0284
> 80136868 bmap 1 0.0312
> 801368d4 update_atime 4 0.0400
> 80136a90 inode_change_ok 1 0.0027
> 80136c00 inode_setattr 1 0.0058
> 80136d54 is_bad_inode 1 0.0357
> 80137248 create_elf_tables 2 0.0025
> 8013792c load_elf_binary 2 0.0006
> 80139a4c load_aout_binary 1 0.0006
> 8013dc60 ext2_permission 10 0.0446
> 8013e02c ext2_free_blocks 2 0.0027
> 8013e310 ext2_new_block 5 0.0022
> 8013ec68 ext2_check_blocks_bitmap 1 0.0014
> 8013ef20 ext2_count_free 1 0.0125
> 8013ef78 ext2_check_dir_entry 1 0.0057
> 8013fad8 ext2_open_file 2 0.0500
> 8013ff24 load_inode_bitmap 1 0.0020
> 80140b20 ext2_alloc_block 4 0.0116
> 80140c78 ext2_bmap 1 0.0018
> 80140eac inode_getblk 4 0.0100
> 8014103c block_getblk 1 0.0016
> 8014129c ext2_getblk 5 0.0100
> 801415a0 ext2_read_inode 4 0.0041
> 80141968 ext2_update_inode 6 0.0072
> 80142080 ext2_find_entry 7 0.0097
> 80144e50 ext2_follow_link 1 0.0083
> 80145218 trunc_indirect 1 0.0015
> 8014638c proc_root_lookup 2 0.0067
> 8014a63c sock_alloc 1 0.0071
> 8014aba8 sock_wake_async 1 0.0114
> 8014cd94 alloc_skb 1 0.0042
> 8014ed70 net_bh 1 0.0020
> 80178444 unix_create1 1 0.0060
> 8017cb20 unplug_device 1 0.0119
> 8017cb74 __get_request_wait 35 0.0694
> 8017ce04 add_request 9 0.0133
> 8017d0a8 make_request 65 0.0437
> 8017d678 ll_rw_block 15 0.0288
> 8017e140 current_capacity 4 0.1111
> 8017ec1c ide_wait_stat 17 0.0708
> 8017ede0 start_request 10 0.0210
> 8017efe0 ide_do_request 84 0.1019
> 80180bac ide_delay_50ms 328 13.6667
> 80180bc4 ide_ioctl 124 0.0575
> 8018d870 try_to_identify 2 0.0018
> 8018f8c8 ide_build_dmatable 3 0.0109
> 8018fb78 ide_dmaproc 27 0.0742
> 80191f08 do_rw_disk 58 0.0751
> 80197560 scrup 19 0.0709
> 8019766c scrdown 9 0.0336
> 80197944 build_attr 2 0.0068
> 80197a68 update_attr 1 0.0061
> 80197c90 complement_pos 3 0.0103
> 80198024 add_softcursor 6 0.0294
> 801980f0 hide_cursor 2 0.0152
> 80198174 set_cursor 3 0.0234
> 801981f4 set_origin 1 0.0071
> 801983e8 vc_cons_allocated 1 0.0357
> 80198d74 csi_J 3 0.0076
> 80198f00 csi_K 2 0.0066
> 80199ed4 do_con_trol 16 0.0045
> 8019acb4 do_con_write 303 0.1672
> 8019b3c8 console_bh 1 0.0039
> 8019b96c con_put_char 3 0.0833
> 8019b990 con_write_room 1 0.0417
> 8019ba54 con_flush_chars 1 0.0417
> 8019c7e8 clear_selection 1 0.0139
> 8019c830 sel_loadlut 1 0.0125
> 8019ec88 tty_write 11 0.0212
> 8019fd84 tty_poll 1 0.0068
> 801a0558 tty_ioctl 1 0.0012
> 801a0f58 opost 5 0.0115
> 801a110c opost_block 13 0.0316
> 801a2754 read_chan 2 0.0011
> 801a2e38 write_chan 5 0.0082
> 801a3830 n_tty_ioctl 1 0.0008
> 801a8df8 conv_uni_to_pc 20 0.1190
> 801bb7c4 vgacon_build_attr 1 0.0061
> 801bb8d4 vgacon_set_cursor_size 2 0.0114
> 801bb984 vgacon_cursor 21 0.0520
> 801bc5cc vgacon_save_screen 1 0.0114
> 801bc624 vgacon_scroll 20 0.0388
> 801bc828 vgacon_dummy 1 0.1250
> 801c90b0 __down_failed_interruptible 1 0.0833
> 801c9104 __udelay 14 0.1522
> 801c9160 __const_udelay 2 0.0417
> 801c9190 __generic_copy_to_user 67 0.7283
> 801c91ec __generic_copy_from_user 63 1.0500
> 801c924c strncpy_from_user 6 0.0750
> 801c931c strnlen_user 1 0.0192
> 801c9350 __get_user_1 1 0.0500
> 801c93d0 __put_user_4 1 0.0417
> 00000000 total 3254 0.0039
>
> ------------------------------------------------------------------------
> 4 startup_32 0.0201
> 93076 default_idle 1789.9231
> 5 cpu_idle 0.0595
> 13 __switch_to 0.0637
> 1 sys_fork 0.0357
> 7 __down 0.0068
> 25 system_call 0.4464
> 1 ret_from_sys_call 0.0385
> 1 ret_with_reschedule 0.0625
> 9 restore_all 0.6429
> 9 error_code 0.1607
> 24 page_fault 2.0000
> 77 synchronize_irq 1.6042
> 4 __global_cli 0.0133
> 2 __global_save_flags 0.0294
> 11 __global_restore_flags 0.1618
> 58 handle_IRQ_event 0.4833
> 19 probe_irq_on 0.0257
> 1 old_mmap 0.0014
> 2 flush_tlb_others 0.0052
> 2 flush_tlb_mm 0.0143
> 5 flush_tlb_page 0.0347
> 1 flush_tlb_all_ipi 0.0119
> 1 smp_call_function 0.0033
> 7 get_pte_slow 0.0273
> 2 do_check_pgt_cache 0.0063
> 21 do_page_fault 0.0159
> 1 flush_signal_handlers 0.0132
> 1 sys_rt_sigprocmask 0.0014
> 5 do_sigaction 0.0098
> 2 sys_rt_sigaction 0.0093
> 88 schedule 0.0302
> 28 __wake_up 0.0217
> 1 free_uid 0.0042
> 3 mm_alloc 0.0069
> 2 copy_files 0.0021
> 4 do_fork 0.0013
> 2 do_syslog 0.0008
> 1 in_group_p 0.0250
> 1 in_egroup_p 0.0250
> 1 sys_newuname 0.0034
> 1 exit_notify 0.0012
> 2 sys_wait4 0.0013
> 1 sys_time 0.0278
> 4 do_softirq 0.0294
> 1 del_timer_sync 0.0041
> 3 check_pgt_cache 0.1250
> 12 clear_page_tables 0.0732
> 15 copy_page_range 0.0288
> 11 zap_page_range 0.0168
> 72 do_wp_page 0.0732
> 179 do_anonymous_page 0.7458
> 3 do_no_page 0.0144
> 12 handle_mm_fault 0.0221
> 3 vm_enough_memory 0.0500
> 8 sys_brk 0.0230
> 9 find_vma 0.0900
> 5 find_vma_prev 0.0284
> 3 do_munmap 0.0035
> 1 sys_munmap 0.0060
> 7 do_brk 0.0143
> 2 exit_mmap 0.0032
> 3 insert_vm_struct 0.0060
> 13 merge_segments 0.0190
> 1 __add_page_to_hash_queue 0.0147
> 1 truncate_inode_pages 0.0010
> 2 shrink_mmap 0.0014
> 1 do_buffer_fdatasync 0.0024
> 1 read_cluster_nonblocking 0.0022
> 1 ___wait_on_page 0.0011
> 14 __find_get_page 0.0122
> 7 do_generic_file_read 0.0039
> 51 file_read_actor 0.3750
> 1 generic_file_read 0.0078
> 53 filemap_nopage 0.0500
> 1 generic_file_write 0.0008
> 2 change_protection 0.0038
> 1 sys_mprotect 0.0017
> 1 get_vm_area 0.0057
> 6 kmem_cache_grow 0.0038
> 21 kmem_cache_alloc 0.0221
> 25 kmem_cache_free 0.0230
> 5 kmalloc 0.0050
> 1 kfree 0.0009
> 16 try_to_swap_out 0.0260
> 15 swap_out_vma 0.0344
> 4 swap_out 0.0067
> 1 kswapd 0.0023
> 27 __free_pages_ok 0.0336
> 68 rmqueue 0.0872
> 11 __alloc_pages 0.0724
> 1 nr_free_pages 0.0192
> 13 free_page_and_swap_cache 0.0903
> 1 lookup_swap_cache 0.0042
> 5 __get_swap_page 0.0058
> 1 acquire_swap_entry 0.0016
> 1 sys_chmod 0.0027
> 1 filp_open 0.0109
> 5 get_unused_fd 0.0116
> 5 sys_open 0.0084
> 2 filp_close 0.0208
> 7 do_close 0.0136
> 2 sys_llseek 0.0036
> 13 sys_read 0.0428
> 5 sys_write 0.0164
> 3 get_empty_filp 0.0038
> 2 __fput 0.0263
> 2 _fput 0.0048
> 1 put_filp 0.0048
> 32 __wait_on_buffer 0.0342
> 1 sync_buffers 0.0010
> 3 get_hash_table 0.0144
> 1 getblk 0.0029
> 1 breada 0.0028
> 2 get_unused_buffer_head 0.0030
> 2 set_bh_page 0.0192
> 2 __block_prepare_write 0.0038
> 1 __block_commit_write 0.0053
> 1 block_read_full_page 0.0016
> 3 try_to_free_buffers 0.0034
> 6 cp_new_stat 0.0181
> 1 sys_newstat 0.0031
> 5 sys_newfstat 0.0112
> 2 copy_strings 0.0037
> 1 put_dirty_page 0.0025
> 1 search_binary_handler 0.0014
> 1 do_execve 0.0020
> 3 pipe_wait 0.0036
> 5 pipe_read 0.0075
> 2 pipe_write 0.0021
> 5 getname 0.0179
> 9 permission 0.0326
> 1 get_write_access 0.0312
> 1 put_write_access 0.0833
> 1 cached_lookup 0.0119
> 20 walk_name 0.0167
> 4 walk_init 0.0312
> 2 __namei 0.0200
> 1 vfs_create 0.0037
> 6 open_namei 0.0039
> 1 do_unlink 0.0042
> 2 vfs_follow_link 0.0147
> 1 sys_fcntl 0.0013
> 1 vfs_readdir 0.0051
> 1 filldir 0.0078
> 1 sys_getdents 0.0022
> 1 free_wait 0.0025
> 3 locks_remove_posix 0.0183
> 2 locks_remove_flock 0.0125
> 14 dput 0.0372
> 8 d_alloc 0.0235
> 24 d_lookup 0.0984
> 3 d_rehash 0.0441
> 3 init_once 0.0060
> 2 __mark_inode_dirty 0.0091
> 1 __wait_on_inode 0.0015
> 2 sync_inodes 0.0027
> 1 clear_inode 0.0046
> 15 prune_icache 0.0391
> 2 find_inode 0.0227
> 7 clean_inode 0.0427
> 1 get_empty_inode 0.0036
> 4 get_new_inode 0.0074
> 3 iget4 0.0072
> 9 iput 0.0091
> 1 inode_change_ok 0.0029
> 1 is_bad_inode 0.0312
> 2 load_elf_binary 0.0005
> 1 ext2_get_group_desc 0.0071
> 7 ext2_get_block 0.0041
> 1 ext2_read_inode 0.0010
> 1 ext2_update_inode 0.0011
> 3 ext2_find_entry 0.0031
> 2 blk_get_queue 0.0094
> 93 generic_unplug_device 0.4844
> 50 generic_make_request 0.0198
> 12 __ll_rw_block 0.0270
> 4 tty_write 0.0038
> 1 init_dev 0.0007
> 1 release_dev 0.0004
> 2 tty_ioctl 0.0023
> 2 complement_pos 0.0070
> 2 set_cursor 0.0156
> 2 vc_cons_allocated 0.0625
> 54 do_con_write 0.0284
> 4 con_write 0.0909
> 1 con_flush_chars 0.0417
> 1 opost 0.0023
> 7 opost_block 0.0172
> 1 n_tty_receive_buf 0.0002
> 1 n_tty_open 0.0034
> 2 read_chan 0.0005
> 3 write_chan 0.0027
> 5 keyboard_interrupt 0.0278
> 5 ide_set_handler 0.0208
> 16 ide_wait_stat 0.0678
> 12 start_request 0.0229
> 15 ide_do_request 0.0186
> 2428 ide_intr 2.7717
> 445 ide_delay_50ms 15.8929
> 10 ide_ioctl 0.0045
> 37 do_rw_disk 0.0432
> 1 vgacon_switch 0.0147
> 1 sys_recvfrom 0.0036
> 2 __kfree_skb 0.0070
> 1 udp_v4_mcast_deliver 0.0026
> 81 __rdtsc_delay 2.8929
> 1 __delay 0.0250
> 84 __generic_copy_to_user 1.4000
> 37 __generic_copy_from_user 0.6167
> 1 __strncpy_from_user 0.0278
> 6 strncpy_from_user 0.0750
> 1 clear_user 0.0132
> 3 strnlen_user 0.0469
> 1 __get_user_1 0.0500
> 97999 total 0.0716

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Apr 23 2000 - 21:00:14 EST